Abstract


Fast restoration from a network failure has been recognized as a key ingredient in realizing
survivable networks in emerging high-speed ATM environments. Self-healing techniques have been
proposed to provide service continuity to end-users by autonomously switching affected virtual
paths to alternate routes. However, their success greatly depends on how traffic is distributed and
how spare capacity is dimensioned over the network when a failure occurs. Therefore, a fast
restoration mechanism alone is not enough to achieve a high level of survivability. This project
aims at defining an integrated framework for survivable ATM network management ranging from
network design and virtual path management to fast restoration protocols. The focus is placed on
two aspects essential to promoting restoration capability: survivable virtual path routing (SVPR)
and survivable capacity and flow assignment (SCFA). The SVPR attempts to maximize
restorability from a possible network failure by adjusting the virtual path configuration in
response to a change in traffic demand or facility network configuration. A two-step restoration
concept is also introduced to achieve fast restoration as well as optimal reconfiguration. The SCFA
seeks the most economical physical link capacity placement in order to deploy a survivable
network in a cost-effective manner. Joint optimization is carried out to find a globally optimal
solution. The problems for maximum-flow restoration-based networks are formulated as a
large-scale linear programming model. Several mechanisms are devised to reduce computation time
by facilitating the basis matrix factorization. The SVPR for k-shortest path restoration-based
networks is modeled as a non-smooth multicommodity flow problem with linear constraints. A
modified flow deviation method is developed to obtain a near-optimal solution in a reasonable
computation time. Premature convergence to a non-smooth point is avoided by properly adjusting
optimization parameters. Using the proposed optimization procedures, the minimum resource
installation cost is quantitatively analyzed for different fast restoration schemes. Contrary to a
wide belief in the economic advantage of the end-to-end restoration scheme, this study reveals that
the attainable gain could be marginal for a well-connected and/or unbalanced network.


I  Introduction 


1.1.  Problem  Statement 

Survivability has become a critical issue in telecommunication networks due to increasing
societal dependence on communication systems and the growing importance of information. Its
significance has been further magnified by the advent of high-capacity optical fiber: a 2.4 Gbps
optical transmission system over a single wavelength is already available, with rates beyond
10 Gbps projected for the near future [56]. Using wavelength division multiplexing techniques,
many channels of different wavelengths can be multiplexed on a single optical fiber, and each
optical trunk is expected to support a data rate ranging from a hundred Gbps to Tbps [34] [56]
[66]. In order to utilize the high transmission capacity of optical fiber cables and reduce the
transmission cost, the topology of wide-area networks is becoming sparse, and data streams are
aggregated into relatively few optical fiber cables [55] [56]. Such a large concentration of
transmission capacity in the exchange networks considerably increases the vulnerability of the
network and reduces its robustness to a transmission component failure. In this environment, a
network failure such as an optical link impairment or a node failure can cause a large loss of data,
even in a short outage. Thus, it is imperative to make the service interruption time as short as
possible, and fast restoration from a network failure has become one of the key survivability
issues for future high-speed networks.

In order to mitigate the impact of a network failure, self-healing techniques have been developed
for mesh-type long-haul or metropolitan exchange networks based on Synchronous Transfer Mode
(STM) [2] [35] [80] [82] as well as Asynchronous Transfer Mode (ATM) technologies [5] [43]
(see Section 2.1.2 for details). Their main goal is to provide service continuity so that a network
failure is almost imperceptible to end-users. For this purpose, fast restoration schemes^1
autonomously recover affected traffic in a short period after a failure so that the existing
connections are not dropped^2. In ATM networks, virtual paths^3 passing over an impaired network
component are rerouted upon failure. A fast virtual path (VP) restoration scheme dispatches the
spare bandwidth required for VP rerouting and switches the affected VP’s over the discovered
alternate routes.

1. The terms “self-healing” and “fast restoration” are used interchangeably.

2. The restoration should finish before the outage duration reaches the call-dropping threshold or
before a data protocol timeout occurs [35]. Most existing services are impacted by a 2 to 10 second
service outage [75]. Thus, fast restoration algorithms typically aim at recovering affected
transport paths within 2 seconds [36] [80] [83].

3. A virtual path (VP) is a semi-permanent logical transport path in which multiple calls
(connections) are accommodated (see Section 2.1.1 for details).

The primary concern in previous works was the development of a fast restoration protocol which
accelerates the restoration procedures (restoration speed) and recovers as much affected bandwidth
as possible (restoration ratio or restorability). The proposed algorithms have been evaluated
with respect to these two measures through computer simulation. Only a few sample networks, with
predetermined link flow and spare bandwidth assignments, were employed in these evaluations.

With dynamically changing traffic demand, however, flow assignment varies over time, and the
amount of link spare capacity changes accordingly. Flow assignment and spare bandwidth
allocation have a direct influence on the restorability of a fast restoration scheme: the flow
assignment determines the amount of bandwidth to be recovered when a network failure occurs,
and the spare bandwidth is what the restoration draws on. Thus, in order to offer better network
survivability, it is crucial for a network manager to realize a restorable flow assignment in
response to a change in traffic demand as well as facility network configuration.

Physical link capacity allocation also has a significant influence on the restorability because it
determines the set of feasible flow and spare bandwidth assignments for a given traffic demand.
Full restorability against major network failures could be guaranteed by installing a sufficiently
large number of fiber links throughout the network. However, installation costs could easily rise,
especially in a public inter-office network which covers a geographically wide area. Therefore, a
proper network design procedure is essential in deploying a survivable network in a cost-effective
manner.

This project addresses these two open issues for survivable ATM networks with self-healing
capabilities. The interrelationship of these survivability issues is illustrated in Figure 1-1.
Given an available physical network resource and a traffic demand, the survivable virtual path
routing (SVPR) algorithm finds a virtual path routing and bandwidth assignment which minimizes the
amount of flow that cannot be recovered after possible network failures. This problem arises when
there is a change in the traffic demand patterns or in the underlying facility network. The dynamic
virtual path reconfiguration algorithm is performed on such occasions to maximize the restorability
of the fast restoration procedure under the new network environment. This function offers better
survivability under a changing network environment, especially for a multi-rate service system like
an ATM network, where gross traffic demand patterns could be susceptible to a change in a
relatively small number of high-volume traffic streams. Furthermore, a fully restorable network can
be realized with less resource installation in a dynamically reconfigurable network than in its
static counterpart. Since physical link capacity cannot be increased in real time, a static VP
configuration requires overengineering in order to cope with varying demand patterns [29].

Given a network topology and a projected traffic demand, the survivable network synthesis
algorithm seeks the most economical link capacity assignment that assures full restorability
against any major network failure. This design process and the construction of additional
facilities are initiated whenever the dynamic VP reconfiguration process cannot maintain
survivability at a desired level due to a growth of traffic demand over months or years.

1.2.  Previous  Work 

Three related research fields and their previous development are reviewed in this section:
survivable/reliable network management in conventional networks, fast restoration schemes,
and dynamic virtual path management in ATM networks. Virtually all works have been developed
in separate contexts, and none have consolidated all three elements.

Network synthesis and flow assignment problems have been widely discussed in the context of
traditional telephone networks as well as conventional store-and-forward data networks, although
most works aim at designing networks under normal operating conditions. Relatively few works
exist on survivable/reliable network management, where traditional network reliability measures
for low-speed networks are accommodated in the problem formulation. Depending on the
reliability criterion, these works can be roughly categorized into multiple connectivity problems
and non-simultaneous flow assignment problems.

The multiple connectivity approach has been used extensively in the traditional reliability
literature [3] [25]. It ensures the existence of multiple diverse connections between any
communicating pair. Recently, the idea has been applied to reliable virtual path management [53].
The availability of virtual paths is guaranteed even with a transmission link failure by
distributing bandwidth over at least two VP’s for every switching pair. As for the network
synthesis problem, the multiple connectivity approach uses global availability^4 as a network
reliability measure and designs a network with a minimum cost subject to an upper bound on network
unavailability. This approach would be appropriate for traditional store-and-forward networks
where multiple connectivity prevents the connection from being broken and allows rerouting.
Network congestion and packet overflow could be readily avoided by reactively suppressing the flux
of packets into the network. Although this would incur additional delay and temporary degradation
of quality of service, large data loss can be prevented. Lost packets can also be recovered by a
small number of packet retransmissions. On the other hand, this reactive flow control is
inappropriate for high-speed networks since a large number of cells are in transit over gigabit
links. Thus, huge data loss is unavoidable before reactive action can be initiated due to
relatively slow feedback [10].

4. Global availability is defined as the probability that the network is simply connected.

The connectivity-based reliability measure also poses a serious problem for traditional low-speed
networks. The approach overlooks the level of performance experienced by users in the presence of
a failure [25]. Significant queuing delays or a significant number of lost calls would result
unless sufficient bandwidth is provided. Moreover, a connectivity measure would be inappropriate
for a conventional telecommunication network because it is typically well-connected even after a
network failure [70]. The non-simultaneous flow assignment approach has been proposed to remedy
the above pitfalls. The approach chooses a set of failure events which are most likely to happen
and solves multiple flow assignment problems, one for each failure scenario [70]. Expected lost
calls can be employed as a reliability measure instead of connectivity. Recently, the idea has
been applied to SONET-based transport networks [29] as well as ATM VP-based transport networks
[30]. They assure full service restorability for any failure pattern in the predefined scenarios,
and a new call will be rejected if there is no way to allocate it without violating this full
restorability requirement. This means that call-level quality of service (QOS) may be degraded
during a busy period in order to achieve 100% restorability against failures, which should be
relatively rare events. The network synthesis problem based on the non-simultaneous flow approach
minimizes the resource installation cost while assuring the existence of a feasible flow
assignment in any failure state [41] [58] [59]. The problem can be formulated as a
non-simultaneous multicommodity flow problem. The resulting network can accommodate the traffic
demand projected at the planning phase even in the presence of any failure scenario considered in
the design.

A major weakness of the above approaches is their failure to model the flow lost upon a network
failure. They assume it can be neglected as a transient event. Although the obtained flow
assignment can satisfy traffic demand at a steady state even if a failure is present, it could
incur critical service interruption when a system switches to another steady state upon failure.
This could be especially problematic in high-speed wide-area networks based on ATM technologies
with each link operating on the order of Gbps to Tbps. Not only is a very precise synchronization
of path switching necessary over the network, but the cell sequence of virtual channels in all
VP’s must also be preserved. Since a very large number of ATM cells are in transit over gigabit or
terabit transmission links, even a small difference in the path length between old and new routes
entails a large delay gap between the cells traversing the two routes. This huge gap must be
absorbed somehow upon path switching for all existing VP’s across the network, which is almost
impractical. Path switching without this precaution will cause severe service degradation all over
the network, affecting even the connections which do not suffer directly from the failure. A huge
number of dropped cells must be retransmitted for all VP’s at once, which would significantly
overload the entire network and might cause long-lasting total service degradation. On the other
hand, the self-healing algorithm reroutes only affected VP’s at the time of a failure. Although
temporary service interruption is still inevitable due to the delay from the time of failure to
the activation of alternate VP’s, it is mostly confined to the VP’s directly affected by the
failure. Thus, the service degradation would disappear more quickly. Finally, the network designed
by the non-simultaneous flow assignment approach might not be able to recover all affected flow
since a self-healing algorithm usually requires extra capacity to fulfill the run-time
restoration. In order to circumvent the above issues, a fast restoration protocol must be modeled
in the problem formulation, although virtually all previous literature on survivable network
synthesis and survivable flow assignment has failed to meet this requirement.
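
The size of the delay gap described above can be estimated with simple arithmetic. The sketch
below is an illustration only: it assumes a propagation delay of roughly 5 microseconds per
kilometer of fiber and a fully loaded link, and the function and constant names are hypothetical.
The 53-byte cell size is the standard ATM cell length.

```python
CELL_BITS = 53 * 8            # one ATM cell: 5-byte header plus 48-byte payload
FIBER_DELAY_S_PER_KM = 5e-6   # assumed propagation delay per km of fiber


def cells_in_gap(link_rate_bps, extra_path_km):
    """Cells emitted during the extra propagation delay of the longer route."""
    gap_seconds = extra_path_km * FIBER_DELAY_S_PER_KM
    return link_rate_bps * gap_seconds / CELL_BITS


# A 100 km difference in route length at 2.4 Gbps corresponds to roughly
# 2,800 cells; at 100 Gbps the same difference exceeds 100,000 cells.
gap_2g4 = cells_in_gap(2.4e9, 100)
gap_100g = cells_in_gap(100e9, 100)
```

Under these assumptions the gap grows linearly with the link rate, so absorbing it for every VP in
the network at once becomes harder as link speeds move from Gbps toward Tbps.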

Fast restoration schemes have been extensively discussed^5 since Grover introduced the idea of
self-healing networks [35]. However, none have addressed the survivable flow assignment problem
in this context. Only a few papers have discussed the capacity design problem for an STM-based
self-healing network [37] [68] [69] [78]. Assuming a predetermined link flow assignment, they
seek a spare link placement with minimum capacity installation cost, subject to the constraint
that fast restoration would succeed under any single link failure. The obtained solution, however,
is not necessarily optimal for a given projected traffic demand because a different flow
assignment may produce a more economical capacity assignment. Instead, joint optimization of link
capacity and virtual path flow assignment must be carried out in order to find a truly optimal
solution [24].

In the previous works on self-healing schemes, the main focus has been placed on the restoration
speed and the restoration ratio of the restoration protocols. The effectiveness of the proposed
algorithms has been demonstrated with respect to these two measures. A few sample networks have
been employed in these experiments, where a physical link placement and a link flow assignment
are given. However, the performance of a fast restoration protocol is largely determined by the
physical network resource and flow allocation, as discussed in the previous section. Thus, the
evaluation of fast restoration schemes on an arbitrarily assigned network resource might not
produce meaningful results, although such an approach has often been taken in the literature. This
is mainly because previous works lack an integrated view of survivable network management from
network design and flow assignment to fast restoration protocol. The current literature also
lacks a comparative study of different restoration protocols through quantitative measures.
Although several major alternatives exist for self-healing protocols which differ significantly in
their implementation strategies (see Section 2.1.2 for details), their comparison has been
performed only qualitatively or through simulation on a sample network.

5. Refer to Section 2.1.2 for references and details.

Dynamic virtual path (VP) management has become a hot topic in the field of ATM traffic
management [27] [47] [51] [62] [64] [65] [71] [72] [73]. The introduction of virtual path concepts
significantly reduces the burden of ATM traffic control due to its ability to handle multiple
virtual channels as a bundle (see Section 2.1.1 for details), but it comes at the expense of
resource utilization. By adaptively updating virtual path routing and bandwidth assignment, a
dynamic VP management scheme attempts to enhance the network resource utilization under a quality
of service (QOS) objective, such as cell loss rate and call blocking probability. However, the
vast majority of the previous works were conducted under an implicit assumption of a ‘fail-free’
network. Only very recently has the issue of survivable dynamic virtual path management been
addressed in the literature [30], but fast restoration has not been modeled in that approach.

1.3.  Objectives  and  Organization 

This project aims to define a complete framework for survivable network management ranging from
physical network resource design and dynamic virtual path reconfiguration to fast restoration^6.
This is the first work to incorporate fast restoration control in ATM resource management. The
relevant issues have previously been addressed in isolation, without mutual consideration of the
other important elements. As shown in Figure 1-2, the proposed network management system
integrates them in order to realize efficient and cost-effective ATM network management with
restoration capabilities. The fast restoration capability is especially important for public
long-haul or metropolitan exchange networks. Since a large amount of traffic is aggregated, the
impact of a network failure is extremely large in such networks. This research considers ATM
inter-office networks as its primary application area.

6. The term “survivability” has been used in various contexts in the literature. For example, it
was considered to be ‘the capability of a network where a certain percentage of the traffic can
still be carried immediately after a failure’ [57]. In this project, ‘survivability’ is defined as
‘the ability to provide service continuity upon network failure’. In the ATM network environment,
it can be interpreted as the restorability of the fast VP restoration scheme.

The focus is placed on the survivable network synthesis problem and the survivable virtual path
routing (SVPR) problem. The key objectives of this project are 1) to develop optimization models
which take into account fast restoration schemes and 2) to propose and implement solution
procedures for the models in order to improve ATM network survivability. Three representative
fast restoration protocols are considered (see Section 2.1.2 for details), and they are
independently modeled in the formulation of the survivable network synthesis as well as SVPR
problems. The accommodation of a fast restoration protocol considerably enlarges the size of the
problem. Thus, the challenge is to develop algorithms which make the problems computationally
tractable for networks of a realistic size. A survivable ATM network management architecture is
also introduced to reduce the complexity of survivable network management problems.

Figure 1-2. An integrated view of survivable ATM network management

As for the network synthesis problem, the joint optimization of capacity and flow assignment is
carried out to find a globally optimal solution for a given link cost function^7. Since there is
mutual dependency between the capacity placement and the survivable flow assignment, the obtained
solution cannot be claimed to be optimal if these problems are treated separately. We call this
problem the Survivable Capacity and Flow Assignment (SCFA) problem. The problem formulation
differs by restoration scheme, and distinct solution procedures are developed accordingly. In this
design, full restorability against any single link failure is assured for a projected traffic
demand^8.

The SVPR problem aims to achieve a higher level of survivability by minimizing the expected
amount of lost flow upon fast restoration from a link failure. The VP configuration and bandwidth
assignment are reconfigured in response to dynamic changes in the network environment so that the
self-healing algorithm can succeed. Since the amount of lost flow is calculated based on the
self-healing algorithm, the solution gives an optimal traffic distribution with minimum service
interruption. We also introduce the concept of two-step restoration, which accommodates two
contradicting requirements after a failure: fast restoration and optimal VP reconfiguration.

The proposed optimization procedures give us a tool for comparative study among different fast
restoration schemes. As mentioned before, their pros and cons have been discussed only
qualitatively or based on simulation results on a certain sample network. With the SCFA solution
procedures, restoration schemes can be quantitatively compared with respect to the optimal
resource installation cost. This comparative analysis helps to clarify the economic benefit of the
restoration schemes. The comparison is performed using several networks with diverse topological
characteristics as well as various projected traffic demand patterns. The SVPR solution procedures
further elucidate the advantage of the restoration protocols in terms of their attainable
restorability. An interesting result contrary to a widely believed hypothesis is drawn from this
analysis.

This report is organized as follows. The next section elaborates the proposed survivable ATM
network management system and introduces the survivable ATM network management architecture and
the two-step restoration concept. The SCFA problem is discussed in Section III, including the
problem formulations as well as the development of solution procedures. A comprehensive analysis
of the economic benefit of each restoration protocol is also presented. Section IV is devoted to
the SVPR problem, its formulations and their solution procedures. The effectiveness of the
proposed dynamic VP configuration schemes as well as the two-step restoration scheme is
demonstrated through numerical experiments. These experiments employ the link capacity assignment
obtained through the SCFA solution procedure for a given projected traffic demand. The
performance of the proposed SVPR procedures is examined at various network loads around the
projected demand arising in a practical situation. Finally, Section V summarizes the major
contributions of this research, and the report concludes with possible extensions for future
research.

7. The word ‘optimality’ is used throughout the report in the sense that an obtained solution
minimizes or maximizes a given objective function.

8. Thus, this design aims to construct a cost-effective one-fault-tolerant network.
II  A  Survivable  ATM  Network  Management  System


ATM network resource management requires highly complicated procedures since resource
allocation requests from several levels of traffic entities (i.e. ATM cells, calls and virtual
paths) must be handled effectively to meet various types of quality of service (QOS) objectives
(e.g. cell loss rate and call blocking rate). A layered management architecture can simplify the
network management process by classifying different types of network resources and traffic
entities into layers [40]. However, the existing architecture was developed without any
consideration of network survivability. This work proposes a survivable ATM network management
architecture, where survivability functions are embedded into the virtual path layer and the
facility network layer. Dynamic network reconfiguration is performed at these two layers in order
to enhance network survivability. A two-step restoration strategy is further proposed to meet two
contradicting requirements of virtual path restoration and is implemented at the virtual path
layer. A survivability measure is defined to assess the actual amount of loss the network would
suffer in the event of a failure. This measure is employed as a decision criterion for resource
allocation control in a survivable ATM network management system.

2.1.  Background 

This  section  reviews  the  technologies  relevant  to  survivable  ATM  network  management.  They 
include  the  ATM  and  virtual  path  concepts  and  the  ATM  resource  management  scheme.  ATM 
resource  management  has  been  widely  discussed  in  the  literature,  but  all  previous  works  lack  the 
consideration  of  network  survivability.  This  section  also  surveys  a  variety  of  fast  restoration 
schemes  in  detail. 

2.1.1.  ATM  and  virtual  path  concepts 

Asynchronous Transfer Mode (ATM) has been adopted by CCITT as the transfer mode for the
Broadband Integrated Services Digital Network (BISDN) [1]. ATM is a switching and multiplexing
technique based on a short, fixed-size packet called a cell [61]. Unifying the packet structure
enables fast switching and allows multi-media data with diverse traffic characteristics to be
supported in an integrated manner. Figure 2-1 depicts the ATM cell format standardized by CCITT.

A logical transport path, called a Virtual Path (VP), has been proposed for ATM networks
[71] [72] and adopted by CCITT [1]. A VP is a logical connection between a pair of nodes which are
not necessarily linked by a single physical cable. As shown in Figure 2-2, virtual paths create a
virtual path sub-network over a facility network. Multiple virtual channels (individual service
connections) are accommodated in a VP, and they are processed as a bundle. Each VP is assigned a
route and a bandwidth that determines the upper limit of virtual channels contained in the path. A
virtual path is specified by the Virtual Path Identifier (VPI) within a cell header, while a
virtual channel is identified by the combination of the VPI and the Virtual Channel Identifier
(VCI) (Figure 2-1).

Figure 2-1. The ATM cell structure
GFC: generic flow control
VPI: virtual path identifier
VCI: virtual channel identifier
PT: payload type
CL: cell loss priority
R: reserved

Figure 2-2. A virtual path sub-network

Figure 2-3 illustrates an example of a VP-based ATM transport system. Direct virtual paths are established from Node 1 to Node 3 through Node 2 (VP 1) as well as from Node 1 to Node 2 (VP 2). Each virtual path is shared by multiple virtual channels. Two types of ATM switches are introduced: an ATM virtual path (AVP) switch and an ATM virtual channel (AVC) switch. An AVP switch is directly connected to transport links at each node, while an AVC switch is installed between an AVP switch and subscriber lines. An AVP switch checks only the VPI field of incoming cells. A cell is relayed to an outgoing trunk if the node is a transit node for the VP the cell belongs to (e.g. Node 2 for VP 1 in the above example). A cell is sent to an AVC switch only if the VP is terminated at the node (e.g. VP 2 at Node 2). The AVC switch identifies the service entity for the cell based on the VPI and VCI values and directs the cell to end-user equipment or to another VP.
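
The forwarding rule described above can be sketched in a few lines. This is only an illustrative model, not the switch design of any cited work: the class names `Cell` and `AvpSwitch`, the trunk label, and the use of the VP number as the VPI value are assumptions made for the example (a real network may translate the VPI on each hop).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    vpi: int  # virtual path identifier
    vci: int  # virtual channel identifier


class AvpSwitch:
    """Hypothetical AVP switch that decides on the VPI field only."""

    def __init__(self, transit, terminating):
        self.transit = dict(transit)         # VPI -> outgoing trunk label
        self.terminating = set(terminating)  # VPIs that end at this node

    def forward(self, cell):
        # Transit VP: relay to the outgoing trunk without looking at the VCI.
        if cell.vpi in self.transit:
            return ("trunk", self.transit[cell.vpi])
        # Terminating VP: hand the cell to the AVC switch, which uses VPI and VCI.
        if cell.vpi in self.terminating:
            return ("avc", None)
        return ("discard", None)


# Node 2 of the example: transit node for VP 1, end node for VP 2.
node2 = AvpSwitch(transit={1: "trunk-to-node-3"}, terminating={2})
print(node2.forward(Cell(vpi=1, vci=7)))
print(node2.forward(Cell(vpi=2, vci=7)))
```

Note that the transit table holds one entry per VP rather than one per connection, which is the source of the reduced processing load at transit nodes.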

Figure 2-3. A VP-based ATM transport system

Major advantages of a virtual path are simplified network management, increased flexibility in resource management and enhanced network reliability. First of all, the task of call set-up is simplified by the introduction of virtual paths. A new call is accepted or rejected depending on the current usage of the virtual path bandwidth. If accepted, a part of the VP bandwidth is reserved for this call. This call admission and bandwidth allocation process needs to be performed only at the source node of a VP. Furthermore, the routing tables need to be updated only at the end nodes of a virtual path. Thus, the transit nodes of a VP are free from call-by-call processing during call set-up [71]. The processing load of a transit node is further reduced by grouping virtual channels into a VP. Cells are relayed to an outgoing link only by checking their VPI field at an AVP switch. Without a VP, each node must hold switching information on all existing connections in a look-up table, including the ones relayed at the node.
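
As a minimal sketch of the call admission step, the source node of a VP only needs to compare the bandwidth already reserved with the VP bandwidth. The class `VirtualPath` and its method names are hypothetical, and the bandwidth units are arbitrary.

```python
class VirtualPath:
    """Hypothetical bookkeeping kept at the source node of a VP."""

    def __init__(self, bandwidth):
        self.bandwidth = bandwidth  # bandwidth assigned to the VP
        self.reserved = 0.0         # bandwidth held by calls in progress

    def admit(self, demand):
        # Accept the call only if the VP still has room for it.
        if self.reserved + demand <= self.bandwidth:
            self.reserved += demand
            return True
        return False

    def release(self, demand):
        # Return the bandwidth of a finished call to the VP.
        self.reserved = max(0.0, self.reserved - demand)


vp = VirtualPath(bandwidth=10.0)
print(vp.admit(6.0), vp.admit(5.0))
```

No transit node is consulted in this decision, which reflects the point made above about call-by-call processing.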

Dynamic VP reconfiguration capability gives flexibility in network resource management and realizes a high utilization of transmission resources. The VP routing and bandwidth allocation can be updated in response to a long-term fluctuation or growth of the traffic demand. If a VP bandwidth must be fixed, it should be equal to the maximum required capacity in the long run. This static VP configuration is very inefficient since link capacity cannot be allocated effectively to virtual paths under varying traffic demand patterns [65]. The best resource utilization can be achieved by optimizing the VP routing and bandwidth assignment over a physical network for a given traffic demand. A dynamic VP reconfiguration approach can also adapt to a topological change in the physical network, such as a new facility installation or a change due to a network failure.

Finally, and most importantly for survivable network management, virtual paths can be rerouted in case of failure. Since VP restoration requires fewer operations than rerouting each call individually, it is expected to restore more services in a shorter period [82].

2.1.2.  Fast  path  restoration 

Extensive research has been conducted on fast path restoration schemes in mesh-type long-haul or metropolitan exchange networks since Grover [35] introduced the idea of self-healing networks in 1987 [11] [15] [18] [36] [45] [68] [79] [80] [82] [83]. Originally, the self-healing network was designed for synchronized path (e.g. DS-3 path) restoration in STM networks. An automatic synchronized path switching system called a Digital Cross Connect (DCC) switch is utilized to switch affected paths over alternate routes upon a network failure. Recently, the same technology has been extended to ATM networks where restoration path switching is carried out at the virtual path level [5] [23] [43] [52]. Due to fast ATM cross connect switching, ATM virtual path restoration has the potential of achieving rapid restoration [82]. Furthermore, the spare capacity can be allocated logically over working trunks in a VP-based restoration system. Thus, the spare bandwidth can also help alleviate temporary traffic congestion.

When a network failure happens, the fast path restoration mechanism recovers affected transport paths by rerouting them over alternate paths using spare bandwidth. A restoration protocol typically proceeds through the following four phases:

Phase  1.  Failure  detection  and  notification 

Phase  2.  Restoration  path  hunting 

Phase  3.  Restoration  bandwidth  reservation 

Phase  4.  Path  rerouting 

The following is one of the typical scenarios employed in self-healing networks [82] [83]: When a link failure happens (Phase 1), one end of the failed facility is designated the Sender node. The Sender broadcasts restoration messages toward the other end of the link, which is called the Chooser node (Phase 2). All nodes other than the Sender and the Chooser act as tandem nodes. A hop count limit is imposed in this message flooding phase in order to prevent unnecessary message circulation. A tandem node writes its node ID and available spare bandwidth information into a message and relays the updated message to its adjacent nodes. The routing information is accumulated as a message traverses the network. When a message reaches the Chooser node, the stored information constitutes an available restoration route between the two end nodes. Upon arrival of the message, the Chooser node sends an acknowledgment message back to the Sender over the discovered alternate route. The spare capacity is reserved in this phase (Phase 3). Finally, the Sender issues a confirmation message back to the Chooser, which subsequently triggers the transport path switching over the route (Phase 4). This procedure usually results in finding a set of successive shortest paths and reserving their maximum spare bandwidth until enough restoration capacity is found. A double-search algorithm has also been proposed to perform bidirectional restoration [23]. Two Sender nodes and two Chooser nodes are defined, and the above procedures proceed in parallel from the two end nodes of a failed link to find restoration paths in both directions.
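
The message flooding of Phase 2 can be illustrated with a small centralized simulation. It is a sketch under simplifying assumptions, not the distributed protocol of [82] [83]: the function name `flood`, the `spare` map and the node numbers are hypothetical, messages are processed in hop-count order, and the failed link is simply left out of the `spare` map.

```python
from collections import deque


def flood(spare, sender, chooser, hop_limit):
    """Return (route, bottleneck spare bandwidth) pairs in hop-count order.

    spare maps frozenset({u, v}) to the spare bandwidth of link u-v.
    """
    neighbours = {}
    for link in spare:
        u, v = tuple(link)
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)

    found = []
    queue = deque([([sender], float("inf"))])  # one entry per message
    while queue:
        route, bottleneck = queue.popleft()
        node = route[-1]
        if node == chooser:
            found.append((route, bottleneck))  # a candidate restoration route
            continue
        if len(route) - 1 >= hop_limit:
            continue                           # hop count limit reached
        for nxt in sorted(neighbours.get(node, [])):
            cap = spare[frozenset((node, nxt))]
            if nxt not in route and cap > 0:
                # A tandem node appends its ID and updates the spare bandwidth.
                queue.append((route + [nxt], min(bottleneck, cap)))
    return found


# Link 1-2 has failed; Node 1 is the Sender and Node 2 the Chooser.
spare = {
    frozenset((1, 3)): 2, frozenset((3, 2)): 2,
    frozenset((1, 4)): 1, frozenset((4, 5)): 1, frozenset((5, 2)): 1,
}
for route, bandwidth in flood(spare, sender=1, chooser=2, hop_limit=3):
    print(route, bandwidth)
```

Lowering `hop_limit` to 2 in this example suppresses the three-hop route, which shows how the limit trades restoration capacity against message volume.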

Various other implementation techniques have also been proposed. Fast restoration algorithms can be classified as follows by their path rerouting strategies [19] [82]:

1. Restoration path selection: k-shortest path-based versus maximum flow-based.

2. Network segment to restore: line restoration versus end-to-end restoration.¹

The  self-healing  algorithm  discussed  above  employs  k-shortest  path-based  line  restoration. 


1. Some literature refers to line restoration as link restoration or local rerouting. End-to-end restoration is also referred to as path restoration, point-to-point restoration, or source-based rerouting.

Realizing faster restoration and restoring more affected flow are two major goals in self-healing networks, which lead to two alternative restoration path selection strategies: k-shortest path (KSP) based restoration and maximum flow (MF) based restoration [20] [82] (Figure 2-4). KSP-based restoration aims for quick recovery of affected traffic. As soon as a restoration path is found in Phase 2, the KSP restoration protocol reserves the available bandwidth over the path in Phase 3 and reroutes (a part of) the affected demands over this newly discovered alternate path in Phase 4. This operation is repeated in parallel until all affected bandwidth is restored or the restoration process is terminated due to a lack of available spare bandwidth. In other words, the KSP restoration algorithm recovers interrupted traffic at the earliest possible time, whenever any part of the restoration paths becomes available. However, this procedure may not be able to find the maximum available spare capacity [79] (see Figure 2-4). The MF-based restoration approach, on the other hand, aims to find a rerouting pattern which allows the maximum bandwidth to be restored. For example, Whalen et al. [79] propose to employ a variant of the maximum flow algorithm.

Although the MF algorithm can detect more restoration bandwidth, it is computationally complex, and it is very hard for it to meet the 2-second restoration time objective. Due to the substantial delay, it is imperative that an MF-based fast restoration system precalculates restoration routes and plans rerouting patterns beforehand [11]. The information on the restoration routes is distributed and stored at relevant nodes. When a failure occurs, affected traffic is rerouted using this information.


Figure 2-4. Two restoration path selection strategies: (a) k-shortest path-based restoration; (b) maximum flow-based restoration

Assume all working links possess a unit spare bandwidth. Then only a unit flow is restored with k-shortest path-based restoration, while two unit flows are recovered with maximum flow-based restoration (restoration paths are shown in bold lines). This example is taken from [20]. This type of network configuration is named a ‘trap’ topology in the reference, whose authors attribute to it the failure of the k-shortest path strategy to find a maximum flow assignment.
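
The effect of a trap topology can be reproduced with a short sketch. The topology below is not the one drawn in [20]; it is a hypothetical eight-node example with unit spare bandwidth on every link, chosen so that the shortest restoration route blocks two longer link-disjoint routes. The function `restored_flow` and its flag `allow_rerouting` are likewise assumptions: with the flag off it reserves successive shortest paths greedily (KSP-like), and with the flag on it keeps residual arcs as in the Edmonds-Karp maximum flow algorithm.

```python
from collections import deque


def restored_flow(links, source, sink, allow_rerouting):
    """Total bandwidth restored between source and sink over undirected links."""
    cap = {}
    for (u, v), c in links.items():
        cap.setdefault(u, {})[v] = c
        cap.setdefault(v, {})[u] = c

    total = 0
    while True:
        # Breadth-first search for the shortest route with spare bandwidth left.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in sorted(cap[u]):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total

        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            if allow_rerouting:
                cap[v][u] += bottleneck  # residual arc: flow may be undone later
            else:
                cap[v][u] -= bottleneck  # spare bandwidth is consumed for good
        total += bottleneck


# Hypothetical trap: s-a-b-t is shortest but uses links needed by both
# s-a-p-q-t and s-r-u-b-t.
trap = {
    ("s", "a"): 1, ("a", "b"): 1, ("b", "t"): 1,
    ("a", "p"): 1, ("p", "q"): 1, ("q", "t"): 1,
    ("s", "r"): 1, ("r", "u"): 1, ("u", "b"): 1,
}
print(restored_flow(trap, "s", "t", allow_rerouting=False))
print(restored_flow(trap, "s", "t", allow_rerouting=True))
```

The greedy variant commits to the first route it finds, whereas the maximum flow variant can redirect the flow already placed on link a-b, which is what allows the second unit to be restored.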


With  restoration  path  preplanning,  the  MF  restoration  scheme  could  realize  even  faster  recovery 
than  the  KSP  restoration  scheme.  However,  a  larger  memory  space  is  required  at  each  node  and, 
more  importantly,  it  is  difficult  to  keep  up-to-date  restoration  information  in  a  frequently  changing 
network  environment.  On  the  other  hand,  the  KSP  restoration  protocol  does  not  require  any 
knowledge  of  the  current  spare  bandwidth  distribution  over  the  network.  Therefore,  it  would  be 
preferable  for  networks  involving  a  frequent  change  in  the  VP-level  demand  patterns. 

Depending on the location where traffic rerouting is performed, the network restoration strategies can be further categorized into two classes: line restoration and end-to-end restoration (Figure 2-5). When a link failure occurs, a line restoration scheme establishes alternate routes between the two end-nodes of the failed link and reroutes all affected traffic around the link. On the other hand, the end-to-end restoration scheme switches failed VP’s to alternate routes established between their respective source and destination nodes. When a failure is detected, recovery messages are sent to the source and destination nodes of all affected VP’s to inform them of the failure. At this time, the bandwidth occupied by the affected VP’s is released over their original routes. Then, each source node initiates virtual path switching over alternate routes. Note that all affected VP’s are rerouted between the same end nodes with line restoration (between Nodes 2 and 3 in the example of Figure 2-5-a), while path rerouting takes place at different locations with end-to-end restoration (see Figure 2-5-b).

Figure 2-5. Line restoration and end-to-end restoration: (a) line restoration; (b) end-to-end restoration

Upon a failure of link 2-3, a VP 1-2-3-6 is restored via 1-4-5-6 with end-to-end restoration. However, line restoration would restore the same VP over a longer path 1-2-4-5-3-6 (restoration path #1) or over a considerably more inefficient path 1-2-1-4-5-3-6 which includes backhauling [5] on link 1-2 (restoration path #2). All affected VP’s are rerouted between Nodes 2 and 3 with line restoration. On the other hand, rerouting usually takes place at different locations with end-to-end restoration. In this example, the VP A is rerouted between Nodes 1 and 6, while the VP B is recovered between Nodes 4 and 3.
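
The difference between the two schemes can be made concrete with a small routing sketch. The link list below is an assumption: it contains only the links implied by the paths quoted in the caption of Figure 2-5, and the actual figure may contain others. The helper names are hypothetical, bandwidth is ignored, and the shortest alternate route is chosen by hop count.

```python
from collections import deque


def shortest_path(links, src, dst, failed):
    """Fewest-hop route from src to dst that avoids the failed link."""
    adj = {}
    for u, v in links:
        if {u, v} != set(failed):
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in sorted(adj.get(u, [])):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


def line_restoration(links, vp, failed):
    # Reroute between the end nodes of the failed link only.
    a, b = failed
    i = vp.index(a)  # assumes the VP traverses the failed link from a to b
    return vp[:i] + shortest_path(links, a, b, failed) + vp[i + 2:]


def end_to_end_restoration(links, vp, failed):
    # Reroute between the source and destination nodes of the VP.
    return shortest_path(links, vp[0], vp[-1], failed)


# Assumed topology consistent with the paths named for Figure 2-5.
links = [(1, 2), (2, 3), (3, 6), (1, 4), (4, 5), (5, 6), (2, 4), (5, 3)]
vp_a = [1, 2, 3, 6]
print(line_restoration(links, vp_a, failed=(2, 3)))
print(end_to_end_restoration(links, vp_a, failed=(2, 3)))
```

Under these assumptions the sketch yields the five-hop route 1-2-4-5-3-6 for line restoration and the three-hop route 1-4-5-6 for end-to-end restoration, in agreement with the example.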

End-to-end restoration generally requires a lower resource installation cost to realize a fully restorable network since spare bandwidth can be utilized more efficiently [5] [43] [82]. For given network resources, it has the potential of attaining a higher restorability from a failure. Furthermore, end-to-end restoration can be applied to recovery from a node failure as well as from a link failure [43]. On the other hand, line restoration is expected to be faster in terms of restoration time because its restoration paths are generally shorter and involve fewer intermediate nodes. However, the VP activation process (Phase 4) could be a bottleneck for line restoration due to the localization of the process around a failed link. End-to-end restoration, in contrast, could be achieved fast enough since the rerouting task can be distributed widely over the network [5]. This argument is based on the assumption that the restoration paths are precalculated. The computational complexity of the rerouting path decision process is substantially higher for the end-to-end restoration scheme.²

According to the above classification, there are four possible variations for fast restoration protocols: (1) MF-based rerouting with line restoration (MF-LINE), (2) KSP-based rerouting with line restoration (KSP-LINE), (3) MF-based rerouting with end-to-end restoration (MF-ETE), and (4) KSP-based rerouting with end-to-end restoration (KSP-ETE). Among these combinations, however, KSP-ETE rerouting appears to be impractical due to a very slow restoration speed. Since path rerouting takes place between source and destination nodes in the KSP-ETE scheme, the restoration messages must traverse a considerably wider area than in the KSP-LINE scheme. Thus, a higher hop count limit should be specified to reach the destination, but this increases the number of broadcast messages. Moreover, this message flooding procedure must be performed individually for all source and destination node pairs of the virtual paths affected by a failure. As a result, the number of recovery messages could explode over the network, which would significantly delay the restoration procedure. For this reason, KSP-ETE is not considered in our study.³

2. Given a spare bandwidth assignment, line restoration seeks restoration paths between two end nodes, while end-to-end restoration attempts to find restoration paths between multiple node pairs. Thus, the former problem is formulated as a single commodity flow problem. On the other hand, the rerouting path hunting problem for end-to-end restoration is a multicommodity flow problem, whose complexity is significantly higher than that of a single commodity flow problem.

2.1.3. ATM resource management

In ATM networks, resource allocation requests arise at various levels: ATM cells request a time slot of a transmission link, calls contend for virtual path bandwidth, and virtual paths claim physical link capacity. An ATM resource management system must handle these requests so that quality of service (QOS) objectives are satisfied. For example, the cell loss rate and the call blocking probability must be kept below their designated levels, and virtual path reconfiguration requests must be granted with a probability higher than a specified level. Physical resources must also be properly designed in order to meet these goals. Finding the best resource assignment strategy, however, is a very complicated task since several different levels of requests must be controlled under various types of QOS objectives.

In order to reduce the complexity, a layered resource management architecture has been proposed for ATM networks [40]. In this architecture, resource allocation is layered by time scales. Arrivals of cells, calls, VP reconfiguration requests and physical resource allocation requests occur on different time scales. The processing time of a cell is on the order of microseconds or less, a call session lasts for minutes, a virtual path configuration changes every few hours or days, and a new physical resource allocation is designed once a year or at longer intervals. Typically, their time scales differ by two or more orders of magnitude. Based on this observation, the resource allocation control is divided into four layers: a cell layer, a call layer, a virtual path (VP) layer and a facility network (FN) layer (refer to Figure 2-6).

Resource assignment at a layer is performed so that the QOS of the next lower layer is guaranteed. When a new call arrives, the call layer reserves sufficient bandwidth to assure the cell level QOS (i.e. cell loss rate objective) for the connection. The virtual path layer performs reconfiguration so that enough virtual path bandwidth is allocated to satisfy the call level QOS (i.e. call blocking rate objective). Finally, the facility network layer designs physical resource placement so that the probability of failure to meet a virtual path bandwidth request (i.e. VP-level QOS) is small.

3. Lin et al. [52] discuss in their recent paper a new restoration protocol which is a variant of KSP-ETE. It calculates candidate alternate VP’s beforehand. When a failure happens, the available bandwidth is searched only over the preassigned routes. Their simulation result indicates that restoration can be completed mostly within a second if the protocol is applied to a metropolitan LATA network. In other words, it might be difficult to adapt it to larger networks even with this precaution.

The heterogeneity of traffic supported in the ATM network further complicates the traffic control. The network must support a wide range of traffic with different bandwidth requirements and different degrees of burstiness. In a connection of variable bit rate service (e.g. compressed video traffic), the source bit rate varies over time, and this creates bursty cell arrivals. The notion of equivalent bandwidth has been proposed to facilitate the bandwidth management of variable bit rate sources [21] [38] [40]. The equivalent bandwidth provides a unified metric representing the effective load for the connection. It is computed based on the source statistics and requested QOS of the connection.

The proposed architecture considerably reduces the complexity of ATM network management. Due to a large difference in the operating time scales between two adjacent layers, the transmission resource configuration at a higher layer can be regarded as quasi-static. Thus, each layer only has to manage the corresponding traffic entity⁴ for a given view of an available higher-level network resource [40]. For example, a call admission and its routing can be determined at the call layer for a given VP sub-network, and a virtual path configuration can be optimized at the VP layer for a given physical network. With this limited view of a network resource, the network manager at each layer can concentrate on resource allocation of its traffic entity to promote the QOS of its own layer. For example, the call layer can employ dynamic call routing to reduce call blocking in a VP sub-network [7] [46].

Another simplification attained by the layered architecture is in the interaction between layers. Each layer only has to care about assuring the QOS objective of the next lower layer. This QOS assurance is accomplished by assigning sufficient bandwidth at the layer. The equivalent bandwidth can be defined at each layer, and it expresses the required bandwidth to guarantee the QOS up to the layer of concern [40]. For example, consider the equivalent bandwidth defined at the VP layer, which we call a VP-level bandwidth or VP-level traffic demand. By satisfying VP-level traffic demand, a resulting VP sub-network can restrict the call blocking rate below its designated level. In the formulation of the survivable virtual path routing (SVPR) problem as well as the survivable capacity and flow assignment (SCFA) problem discussed in the following sections, the end-to-end flow requirement is assumed to be given by a VP-level bandwidth. The aggregate traffic load can be obtained by simply adding the equivalent bandwidth of each connection [21]. Using this property, we can define the problem as a multicommodity flow problem [9].


4. Traffic entity can be cells, calls or virtual paths.

2.2. Survivable ATM Network Management

2.2.1.  Survivable  ATM  network  management  architecture 


Network survivability becomes increasingly important as society becomes more dependent on communication networks. However, most previous work on ATM traffic control rests on an implicit assumption that all network components are functioning. In order to build a reliable ATM network, survivability functions must be incorporated into an ATM resource management system. Their integration, however, increases the complexity of the ATM network management tasks. To overcome the difficulties, the layered management architecture described in the previous section is extended to survivable ATM network management. The architecture can simplify the resource management by classifying various levels of network resources and traffic entities into layers.

Figure 2-6 illustrates the survivable ATM network management architecture proposed in this project. The survivability functions are embedded at the VP and higher layers, considering the fact that path level recovery enables rapid and efficient restoration and reduces the complexity of traffic management [82] (refer to Section 2.1.2). A survivability measure (see Section 2.2.3 for details) is introduced to capture the degree of attainable network survivability. At the VP layer, the VP manager builds a VP sub-network over currently available physical network resources. Given a VP-level traffic demand satisfying call level QOS, the VP manager configures virtual paths so that the survivability measure is optimally enhanced. This leads to the SVPR problem discussed in Section IV. The VP manager also performs fast VP restoration when a network failure happens. If the VP manager cannot maintain the survivability measure at a desired level due to a growth of traffic demand, the FN layer must trigger a facility network planning process. The FN manager designs an additional physical resource placement for a newly projected traffic demand so as to satisfy the survivability measure objective.⁵ The SCFA problem discussed in the next section deals with this physical network design problem.

Figure 2-6. A survivable ATM network management architecture (figure labels: SCFA, SVPR)

2.2.2.  Two-step  restoration  scheme 

Two contradicting requirements should be satisfied upon virtual path restoration. After a failure, it is desirable to realize an optimal VP configuration which incurs the least service interruption upon a possible subsequent failure. However, the optimal flow calculation introduces a computational delay, while it is essential to complete the restoration as quickly as possible for high-speed networks. In order to accommodate these contradicting requirements, fast VP restoration and optimal VP reconfiguration, the VP manager uses a two-step restoration approach as shown in Figure 2-7. Upon failure, the VP restoration manager executes the fast restoration procedure to accelerate recovery from the failure. After the restoration is completed, the VP planner module computes an optimal VP assignment for the new network topology, and the VP configuration is changed again to the newly calculated optimal solution (network-wide restoration). Although this scheme temporarily produces a flow which is not optimal from the survivability viewpoint, it is permissible in practice since more than one network failure is unlikely to happen in quick succession.


5. The VP-level QOS objective must also be satisfied by the FN layer. Namely, the probability of failure to assign requested VP-level bandwidth must be less than a designated level. However, since a survivable network requires significantly more physical resource installation for restoration purposes, a network satisfying the survivability measure objective can also meet the VP-level QOS objective. Therefore, it is not necessary to consider the VP-level QOS in the survivable network management system.




2.2.3.  Survivability  measure 


The survivability measure gives a quantitative metric for the attained survivability level of the network. It is used as a decision criterion for resource management control in a survivable ATM network. For example, the VP manager employs the survivability measure as an objective function of the VP planner module. Based on this measure, the FN manager decides when to initiate a facility network planning process. The survivability measure is also useful in evaluating the performance of the proposed VP planner modules.

A good measure of survivability should express the actual amount of damage experienced in the network or by end-users, instead of using traditional reliability criteria such as global availability. The number of lost calls has been proposed as such a measure at the call layer in telecommunication networks [55] [70]. Similarly, the amount of lost flow (VP-level bandwidth) can be employed to express damage to the VP layer. A restoration ratio has also been used in the literature on self-healing networks [37] [78]: it is defined as the expected percentage of recoverable flow per span failure. Since the average flow per span depends on the flow assignment, this measure does not precisely express the amount of damage in a dynamically reconfigurable network environment. In the following, the amount of lost flow is employed as the survivability measure.

Let  5  be  a  set  of  possible  failure  patterns  and  be  the  probability  of  a  failure  event  s  g  S . 


Network  failure 


Topological  change 
Demand  dynamics 


Figure  2-7.  Two-step  restoration  approach 
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Then,  the  expected  lost  flow  L,  given  by  L  —  ^  ^s’  employed  as  the  survivabil¬ 

ity  measure,  where  is  a  lost  flow  due  to  a  failure  event  s  s  S .  The  selection  of  the  state  space  S 
is  another  factor  for  survivable  network  design.  The  more  failure  patterns  are  included,  the  more 
precise  estimate  of  the  damage  the  survivability  measure  can  express.  However,  this  increases  the 
complexity  of  the  design  procedure.  For  design  purposes,  it  would  be  sufficient  to  consider  a  set  of 
the most probable failure events as the whole state space. According to [82], a complete fiber cable
cut has been the most common and most frequently reported failure event among disastrous network
failures over the decades. Therefore, we take the set of single-link failure events as the whole state
space ($S = E$) and assume that each failure is equally weighted ($w_s = 1/|E|$), where $E$ is the set of
links in the network. Note that although a VP assignment is computed assuming a single link failure,
the system can work against any failure as long as the fast restoration algorithm supports it.
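
As a minimal sketch of this measure, the expected lost flow under equally weighted single-link failures can be computed as below. The function name and the per-link loss values are hypothetical and serve only to illustrate the formula; they are not taken from the experiments.

```python
from typing import Dict, Hashable


def expected_lost_flow(lost_flow: Dict[Hashable, float]) -> float:
    """Expected lost flow L = sum_s w_s * L_s with equal weights w_s = 1/|E|.

    lost_flow maps each single-link failure event s to its lost flow L_s.
    """
    if not lost_flow:
        raise ValueError("at least one failure event is required")
    weight = 1.0 / len(lost_flow)  # w_s = 1/|E|
    return sum(weight * loss for loss in lost_flow.values())


# Hypothetical per-link losses (in bandwidth units) for a four-link network.
example_losses = {"e1": 0.0, "e2": 40.0, "e3": 0.0, "e4": 20.0}
print(expected_lost_flow(example_losses))  # 15.0
```

A non-uniform failure probability would only change how `weight` is obtained; the sum itself stays the same.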

The worst-case lost flow per failure, given by $L = \max_{s \in S} L_s$, would be another candidate for
representing the survivability measure. By minimizing this measure, the VP planner module will find an
optimal  minimax  routing.  The  minimax  routing  has  been  reported  to  be  preferable  since  the  opti¬ 
mal  solution  tends  to  balance  the  traffic  over  the  network,  resulting  in  better  call  and  cell  level 
QOS [48]. However, it turns out to be inappropriate as the survivability measure. The problem
arises if there is a link where a significant loss becomes inevitable due to demand growth. A typical
example is a failure of either link adjacent to a node of degree two. If line restoration is
employed, all affected VP's must be rerouted over the one remaining link incident to the node.
Thus, if the total bandwidth of either link is less than the demand originating and terminating at
this node, then data loss is unavoidable after a failure of the adjacent link^6. The maximum loss per
failure would occur upon failure of such a link. Any flow assignment is then optimal under the
minimax objective, as long as no virtual paths are relayed at the node, which minimizes this
maximum loss. However, the survivability level attainable under the other link failure scenarios could
differ significantly among such assignments, and the minimax objective function cannot distinguish
among them.
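
The limitation can be illustrated with a small sketch. The two loss profiles below are hypothetical: in both, link "e1" stands for the link next to a degree-two node whose loss of 50 BU is unavoidable, while the losses on the other links differ.

```python
def worst_case_lost_flow(lost_flow):
    """Minimax measure: the largest lost flow over all failure events."""
    return max(lost_flow.values())


def mean_lost_flow(lost_flow):
    """Expected lost flow with equally weighted failure events."""
    return sum(lost_flow.values()) / len(lost_flow)


# Hypothetical lost flow per single-link failure for two VP assignments.
assignment_a = {"e1": 50.0, "e2": 0.0, "e3": 0.0, "e4": 0.0}
assignment_b = {"e1": 50.0, "e2": 30.0, "e3": 30.0, "e4": 10.0}

# Both assignments look identical under the minimax measure ...
print(worst_case_lost_flow(assignment_a), worst_case_lost_flow(assignment_b))  # 50.0 50.0
# ... while the expected lost flow separates them.
print(mean_lost_flow(assignment_a), mean_lost_flow(assignment_b))  # 12.5 30.0
```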

2.2.4.  Network  reconfiguration 

The primary goal of network reconfiguration is to realize a high utilization of network resources
while maintaining the design objectives of each layer (QOS or survivability measure) at an acceptable
level. Three types of network reconfiguration are identified, corresponding to the FN, VP and
call layers. Network reconfiguration mechanisms at the FN and VP layers are related to the
survivability requirements, which lead to the SCFA and SVPR problems discussed in the subsequent
sections.

6. Namely, the only remaining link leaving the node cannot accommodate the demand originating and terminating at the
node.

Each  call  session  typically  lasts  for  minutes.  Thus,  an  aggregate  traffic  demand  may  fluctuate 
over  minutes  to  hours.  This  short-term  demand  fluctuation  can  be  accommodated  at  the  call  layer 
by  applying  a  dynamic  call  routing  mechanism  over  a  VP  subnetwork.  Dynamic  call  routing 
attempts  to  use  an  alternate  route  when  a  primary  route  (e.g.  direct  VP)  is  not  available  (refer  to  [7] 
[8]  and  [46]  for  details).  We  refer  to  this  dynamic  adjustment  of  call  routing  as  short-term  reconfig¬ 
uration.  By  adaptively  assigning  transmission  resources,  dynamic  call  routing  is  expected  to  cope 
with load variations and to lower the call blocking rate. Furthermore, in a long-haul network the
peak load occurs at different times in different time zones. A dynamic routing scheme routes
part of the peak traffic in one zone over alternate routes in a non-peak zone in order to promote
resource utilization [46]. The VP-level reconfiguration is required when the call level QOS cannot
be  maintained  even  with  the  dynamic  call  routing  mechanism  due  to  a  further  traffic  fluctuation 
over  a  longer  time  scale. 

VP-level  dynamic  reconfiguration,  which  is  called  long-term  reconfiguration,  promotes  the 
attainable  survivability  level  in  response  to  a  change  in  network  environments.  In  addition  to  the 
network-wide  restoration  after  a  failure,  there  are  several  scenarios  in  which  the  VP  planner  is 
invoked.  First  of  all,  the  VP  planner  is  triggered  when  the  higher  layer  configuration  (facility  net¬ 
work)  has  changed,  not  only  due  to  a  network  failure  (an  unplanned  change)  but  also  due  to  a 
planned  change  such  as  reintegration  of  the  repaired  component,  a  new  facility  installation,  or  a 
planned facility removal (for a component test, for example). VP reconfiguration is executed in
order to cope with a change in the underlying facility network and to improve the network
survivability. The VP reconfiguration process is invoked before the planned removal of a network
resource, or after a planned installation or an unplanned change. These changes would happen very
infrequently, say over months.

A  dynamic  VP  reconfiguration  procedure  is  invoked  more  frequently  to  cope  with  demand 
dynamics. The VP-level traffic demand patterns may change within hours due to demand variations
within a day. They may also fluctuate over days or weeks due to daily or seasonal demand
characteristics or due to long-term growth of traffic demand. Owing to the vast volume of traffic aggregated
in  the  ATM  inter-office  networks  and  due  to  the  effect  of  dynamic  call  routing,  it  is  expected  that 
short-term  load  variations  would  be  statistically  averaged.  Thus,  the  VP  reconfiguration  interval 
would  be  relatively  long:  It  could  be  a  few  hours  or  days  depending  on  how  fast  the  aggregate  traf¬ 
fic  fluctuates  [40]. 

The  reconfiguration  interval  is  also  influenced  by  the  VP  traffic  control  policy  and  the  speed  of 
the  VP  planner  module.  A  fine  tuning  of  the  VP  configuration  could  lead  to  a  higher  utilization  of 
the  network  resource.  However,  a  frequent  update  increases  the  load  of  the  VP  planner  module. 
Thus,  the  update  interval  must  be  determined  considering  the  trade-off  between  reducing  process¬ 
ing  complexity  and  increasing  network  utilization.  The  VP  planner  should  compute  a  new  VP  con¬ 
figuration  as  quickly  as  possible,  say  in  a  period  one  or  two  orders  of  magnitude  less  than  the  VP 
update  interval.  Thus,  the  speed  of  a  VP  planner  module  further  determines  a  feasible  range  of  a 
VP  configuration  update  interval. 

The  call  manager  at  the  call  layer  monitors  call-level  traffic  activities  and  periodically  calculates 
statistics  on  a  call  blocking  rate  as  well  as  VP  bandwidth  usage.  VP  reconfiguration  is  required  if  a 
call blocking rate exceeds the call level QOS objective. The VP reconfiguration process is also
necessary if the demand forecast for an upcoming period^7 cannot be accommodated in the current
configuration. In these cases, the VP planner reconfigures virtual paths so as to satisfy the newly
projected VP-level traffic requirement. Out of the candidate configurations that do so, the VP planner
seeks the one which attains the best network survivability.
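
The two triggers described above can be sketched as a simple predicate. This is an illustration only: the function name, the argument names and the per-VP dictionaries are assumptions, not part of the proposed call manager.

```python
from typing import Mapping


def needs_vp_reconfiguration(
    blocking_rate: float,
    blocking_objective: float,
    forecast_demand: Mapping[str, float],
    vp_bandwidth: Mapping[str, float],
) -> bool:
    """Return True if either trigger for VP reconfiguration fires.

    Trigger 1: the measured call blocking rate exceeds the call level QOS objective.
    Trigger 2: the demand forecast of some VP exceeds its currently assigned bandwidth.
    """
    if blocking_rate > blocking_objective:
        return True
    return any(
        demand > vp_bandwidth.get(vp, 0.0)
        for vp, demand in forecast_demand.items()
    )


# Hypothetical statistics collected over one measurement period.
print(needs_vp_reconfiguration(0.005, 0.01, {"vp1": 120.0}, {"vp1": 100.0}))  # True
print(needs_vp_reconfiguration(0.005, 0.01, {"vp1": 80.0}, {"vp1": 100.0}))   # False
```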

When a new VP configuration is calculated, the change must be made gracefully. Here ‘graceful’
means that the VP reconfiguration does not affect the connections currently in service. In
order to avoid abrupt service interruption and provide service transparency, the cell sequence
integrity must be assured for all VC’s upon VP reconfiguration. If a VP route stays the same, this
precaution is unnecessary; VP reconfiguration can be accomplished immediately by changing its
bandwidth. However, the VP reconfiguration may involve a radical change in its routing assignment,
especially upon the network-wide restoration after a failure. In such a case, cells may arrive
out of sequence if a path is switched over without any precaution. There are two possible
approaches to realize graceful VP reconfiguration. One way is to keep an old virtual path until all
active connections are closed. All newly admitted calls are routed over a new path, and the VP
bandwidth is gradually reassigned to the new path as the VP bandwidth over the old path is
released. Another possible approach is the application of the ATM hitless path switching mechanism
[76] [82]. The hitless switching mechanism synchronizes the old and new routes.
The difference in delay between the two routes is adjusted by cell buffering if the original route is
longer than the new one^8. This hitless switching mechanism can be gradually applied to the virtual
paths whose routes change. A combination of the above two approaches would give a further
alternative, where long-lasting connections over an old virtual path are finally switched to a new
path by the hitless path switching mechanism.

7. The demand projection could be calculated from the current traffic load and the demand patterns experienced in the
past. The maximum required bandwidth over an upcoming period is reserved.
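
The effect of delay equalization can be illustrated with a toy timing model. This is a sketch under simplifying assumptions (one cell sent per time unit, fixed route delays, hypothetical function and parameter names) and is not the hitless switching mechanism of [76] [82] itself.

```python
def delivery_order(num_cells, switch_index, old_delay, new_delay, equalize=False):
    """Return cell sequence numbers in order of delivery.

    Cell i is sent at time i. Cells before switch_index travel over the old
    route, the remaining cells over the new route. With equalize=True, cells
    on the new route are buffered by the delay difference between the routes.
    """
    # Buffering is only needed when the old route is the longer one.
    buffer_delay = max(0.0, old_delay - new_delay) if equalize else 0.0
    deliveries = []
    for seq in range(num_cells):
        if seq < switch_index:
            delivered_at = seq + old_delay
        else:
            delivered_at = seq + new_delay + buffer_delay
        deliveries.append((delivered_at, seq))
    return [seq for _, seq in sorted(deliveries)]


# Old route delay 5.0, new route delay 1.5, path switched before cell 3.
print(delivery_order(6, 3, 5.0, 1.5))                 # [3, 0, 4, 1, 5, 2]
print(delivery_order(6, 3, 5.0, 1.5, equalize=True))  # [0, 1, 2, 3, 4, 5]
```

Without buffering, the first cells on the shorter new route overtake the last cells on the old route; delaying them by the difference in route delays restores the cell sequence.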


Finally, network reconfiguration also occurs at the FN layer. When the VP planner cannot
maintain the survivability measure at a desired level^9, the FN manager is invoked to design a
survivable network in a cost-effective manner. This network reconfiguration is called very long-term
reconfiguration. Given a maximum VP-level traffic demand projected over some future time
frame (say a year or two), the FN manager seeks the physical resource assignment which guarantees
full restorability from network failure at a minimum cost. This reconfiguration typically happens
on a very long time scale, say over years.


Table  2-1  summarizes  various  types  of  the  network  reconfiguration  for  survivable  ATM  net¬ 
works. 

8. In this case, out-of-sequence cell delivery may happen without buffering since a cell over a new route just after path
switching may arrive at the destination earlier than a cell over an old route just before the switching.

9. For example, when full restorability is not assured for a long period (say a few months), or when an expected lost
flow (survivability measure) averaged over months grows beyond a designated level (say 0.1 percent of the average
load). The statistics on the attained survivability measure can be collected by the VP manager.

2.3. Summary


This section introduces the proposed survivable ATM network management system. The survivable
ATM network management architecture integrates survivability functions into the existing
ATM resource management system at the FN and VP layers. In order to achieve a higher level of
network survivability, network reconfiguration is performed at these two layers. This leads to the
two problems discussed in the following two sections: the SCFA (survivable capacity and flow
assignment) problem and the SVPR (survivable virtual path routing) problem. The survivability
measure is introduced to control resource allocation in the proposed survivable network management
system. The two-step restoration strategy is proposed to meet two contradicting
requirements of virtual path restoration: fast restoration and optimal flow assignment. Its effectiveness
will be discussed in Section IV.

This  section  presents  several  implementation  techniques  of  the  fast  restoration  protocol.  The 
selection  of  a  restoration  strategy  completely  depends  on  the  application  environment.  The  follow¬ 
ing  criteria  must  be  considered  at  the  design  phase  of  a  survivable  ATM  network: 

1.  Link  installation  cost 

2.  Attainable  restorability  for  a  given  physical  resource 

3. Speed of the VP planner module

4.  Speed  of  the  VP  restoration  manager  module 

5.  Frequency  of  change  in  the  VP-level  traffic  demand 

6.  Implementation  cost  of  the  VP  manager 

This project clarifies the effectiveness of each restoration scheme with respect to the first three
criteria. The required spare capacity cost is elaborated in the next section, while Section IV discusses the
attainable  restorability  of  each  scheme  as  well  as  the  applicability  of  the  proposed  VP  planner 
module  based  on  the  required  computation  time.  The  analysis  presented  in  this  project  would  help 
to  elucidate  the  applicable  VP  control  and  restoration  strategies.  The  results  can  be  applied  to  the 
decision  process  at  the  design  phase  of  a  survivable  ATM  network  management  system. 


Table 2-1. Various types of network reconfiguration for survivable ATM networks

Type of network reconfiguration | Descriptions
Very long-term reconfiguration | Reconfiguration of the underlying facility network at the FN layer to maintain the survivability level. Triggered when a VP planner cannot maintain a survivability measure at a satisfactory level, mainly due to a growth of traffic demand over a very long time scale, say over years.
Long-term reconfiguration | Network-wide optimal VP reconfiguration. Triggered mainly due to a change in a traffic trend at a longer time scale, say over hours or days. Also invoked when a significant topological change occurs, including one due to a failure.
Short-term reconfiguration | Reconfiguration due to traffic fluctuation for minutes to hours. Uses a dynamic call routing mechanism over a VP subnetwork at the call layer.
III Survivable Capacity and Flow Assignment (SCFA)


The SCFA problem arises when the virtual path layer cannot maintain the network survivability
at a desired level due to a growth of traffic demand over months or years. The facility network
layer invokes the FN manager to plan additional network resources on such occasions. This section
develops optimization procedures for networks based on two different restoration
schemes: line restoration and end-to-end restoration. Capacity and flow assignment are jointly
optimized to find an optimal capacity placement. Several mechanisms are proposed to reduce the
computational complexity of the joint optimization. As discussed in Section II, end-to-end restoration
schemes have been considered more advantageous than line restoration schemes because a fully
restorable network could be realized with less spare capacity. An extensive comparative analysis is
conducted to quantify the benefit of end-to-end restoration schemes in terms of
minimum resource installation cost. This study reveals that the economic gain could be nominal
for a well-connected and/or unbalanced network.

3.0.  Introduction 

The  SCFA  problem  aims  to  find  the  most  economical  capacity  placement  which  assures  a  fully- 
restorable  virtual  path  assignment  for  a  projected  traffic  demand.  Previously,  only  a  limited  num¬ 
ber  of  works  have  studied  an  optimal  spare  capacity  placement  problem  for  line  restoration-based 
self-healing  STM  networks  [37]  [68].  Given  the  number  of  working  transport  paths  per  link,  these 
studies  calculate  a  spare  link  assignment  with  minimum  installation  cost,  subject  to  the  constraint 
that  any  single-link  failure  can  be  restored  successfully.  A  flow  is  preassigned  to  each  link, 
although  there  is  no  guarantee  that  this  assignment  gives  the  most  cost-effective  capacity  place¬ 
ment.  As  for  end-to-end  restoration-based  networks,  no  published  work  on  the  optimal  capacity 
assignment  problem  was  found. 

We  propose  a  jointly  optimal  capacity  and  flow  assignment  procedure  in  order  to  find  a  globally 
optimal  solution  for  a  given  link  cost  function.  Since  there  is  mutual  dependency  between  the  link 
capacity placement and the survivable flow assignment with a given link capacity, the obtained
solution cannot be claimed to be a true optimum if these problems are treated separately [24].
However,  the  complexity  of  the  problem  grows  tremendously  with  joint  optimization.  Several 
mechanisms  are  developed  to  make  this  problem  computationally  more  tractable. 

Despite the high complexity of its rerouting decision process, an end-to-end restoration scheme
has been widely cited as more advantageous than a line restoration scheme [5] [11] [19] [43]
[68] [82]. The former scheme could use the spare bandwidth more effectively, and thus less redundant
capacity would be necessary to construct a fully restorable network. However, no comprehensive
work has been performed on the quantitative comparison of these two schemes to confirm this
assertion for various network environments. The only available numerical results so far are based
on computer simulation^1 [5] or heuristics [43] using a single sample network, and it is not
clear how close the obtained solutions are to the optimum. A comparison based on heuristics
leaves us uncertain as to whether the benefit comes from the end-to-end restoration scheme or from
the optimization error.

With globally optimal solution procedures at hand, we can now make a meaningful comparison
between the two restoration schemes. A comparative study is performed with respect to the minimum
spare capacity installation cost in order to elucidate the economic gain due to the end-to-end
restoration schemes. Several networks with diverse topological characteristics as well as several
projected traffic demand patterns are employed in the experiments to see the effect of various
network parameters.

The rest of the section is organized as follows. Section 3.1 provides the formal definition of the
SCFA problem and discusses several options which determine the problem structure. The problem
formulation and the solution approach are presented in Section 3.2 for self-healing networks
based on the line restoration protocol, while Section 3.3 is devoted to those based on the end-to-end
restoration protocol. The validity of the proposed algorithms is also demonstrated in the respective
sections. The results of a comprehensive numerical study are reported in Section 3.4, followed by
concluding remarks in Section 3.5.


1.  No  details  are  provided  in  the  reference. 
3.1. Formal Problem Definition


An ATM network is modeled as a directed graph $G = (V, A, c)$, where $V$ is a node set representing
ATM switches, $A$ is a set of directed arcs representing optical trunks, and $c = (c_a)$ is a
vector of arc capacities ($a \in A$). Let $E$ denote a set of undirected links. We assume that the network
is bidirectional and that each link consists of two directed arcs with the same end-nodes but in
opposite directions (i.e. $|A| = 2|E|$). A link represents a complete set of optical fibers installed
between two nodes, and a network failure (e.g. a complete span cut) is expressed in terms of a link
in the problem formulation. On the other hand, an arc embodies a set of transmission media going
from one node to the other end of a link, and the capacity and flow assignments are calculated per
arc.


A commodity is a traffic flow from an origin to a destination. One commodity is defined for each
origin and destination pair. Let $\Pi$ be a set of commodities in the network and $Q = (q_\pi)$ be a vector
of the requested VP-level bandwidth for each commodity $\pi \in \Pi$. Multiple VP’s are established
for each commodity to satisfy the traffic requirement. It is assumed that each commodity
involves a significant number of VP’s and thus that the bandwidth of each VP is considerably
small compared to $q_\pi$. This is a reasonably realistic assumption for the future ATM interoffice
exchange networks where a very large volume of calls with a wide variety of services are aggregated.
A VP is expected to have a bandwidth on the order of a few Mbps to a hundred Mbps [6],
while a link will be composed of multiple optical fiber cables, each having a capacity on the order
of ten Gbps to Tbps [56] [66]. Based on the above assumption, a commodity flow as well as a
restoration flow can be simply expressed by a single continuous flow variable in the problem
formulation, instead of using an individual variable per VP. This considerably reduces the complexity of
the problem while maintaining the accuracy of the solution. Let $f$ denote a vector of arc flow.
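
The model above can be mirrored by a small data structure. The sketch below is illustrative only: the class and helper names are hypothetical, and giving both arcs of a link the same capacity is a simplification of the example, since the formulation assigns capacity per arc.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Arc = Tuple[int, int]    # directed arc (tail node, head node)
Link = Tuple[int, int]   # undirected link, stored as (smaller node, larger node)


@dataclass
class AtmNetwork:
    nodes: List[int]                                    # V
    capacity: Dict[Arc, float]                          # c = (c_a), one entry per arc in A
    demand: Dict[Tuple[int, int], float] = field(default_factory=dict)  # Q = (q_pi)

    @property
    def links(self) -> List[Link]:
        """The undirected link set E derived from the arc set A."""
        return sorted({(min(u, v), max(u, v)) for u, v in self.capacity})

    def arcs_of(self, link: Link) -> List[Arc]:
        """The two opposite arcs that make up one link."""
        u, v = link
        return [(u, v), (v, u)]


def build_bidirectional(nodes, link_capacity, demand):
    """Create a network in which every link carries two opposite arcs."""
    capacity = {}
    for (u, v), cap in link_capacity.items():
        capacity[(u, v)] = cap   # assumption: same capacity in both directions
        capacity[(v, u)] = cap
    return AtmNetwork(nodes=list(nodes), capacity=capacity, demand=dict(demand))


# Hypothetical three-node network with one commodity in each direction.
net = build_bidirectional(
    [1, 2, 3],
    {(1, 2): 600.0, (2, 3): 600.0, (1, 3): 400.0},
    {(1, 3): 100.0, (3, 1): 100.0},
)
print(len(net.capacity), len(net.links))  # 6 3, i.e. |A| = 2|E|
```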

Now  the  SCFA  problem  can  be  formally  stated  as  the  following  optimization  problem: 


(SCFA)
Given       V, A and Q,
Minimize    network resource installation cost D(c)
over        c and f,
subject to  a) f is a multicommodity flow satisfying Q,                  (flow conservation law)
            b) capacity constraints are satisfied, and                   (capacity constraints)
            c) full restorability is assured through fast restoration.   (full restorability constraints)
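
Constraints a) and b) can be checked for a candidate solution with a short sketch such as the one below. The per-commodity arc-flow representation and the function names are assumptions made for illustration; the capacity check covers working flow only, and the full restorability constraints c), whose exact form is deferred to the appendices, are not covered here.

```python
from typing import Dict, Iterable, Tuple

Arc = Tuple[str, str]
Commodity = Tuple[str, str]   # (origin, destination)


def satisfies_flow_conservation(
    flow: Dict[Commodity, Dict[Arc, float]],
    demand: Dict[Commodity, float],
    nodes: Iterable[str],
    tol: float = 1e-9,
) -> bool:
    """Check that every commodity ships exactly its demand from origin to destination."""
    node_list = list(nodes)
    for (origin, dest), q in demand.items():
        arc_flow = flow.get((origin, dest), {})
        for node in node_list:
            out_flow = sum(x for (u, _), x in arc_flow.items() if u == node)
            in_flow = sum(x for (_, v), x in arc_flow.items() if v == node)
            # Net outflow is +q at the origin, -q at the destination, 0 elsewhere.
            required = q if node == origin else -q if node == dest else 0.0
            if abs(out_flow - in_flow - required) > tol:
                return False
    return True


def satisfies_capacity(
    flow: Dict[Commodity, Dict[Arc, float]],
    capacity: Dict[Arc, float],
    tol: float = 1e-9,
) -> bool:
    """Check that the aggregate flow on every arc stays within the arc capacity."""
    total: Dict[Arc, float] = {}
    for arc_flow in flow.values():
        for arc, x in arc_flow.items():
            total[arc] = total.get(arc, 0.0) + x
    return all(x <= capacity.get(arc, 0.0) + tol for arc, x in total.items())


# Hypothetical instance: 10 BU from a to c, split over a-c and a-b-c.
demand = {("a", "c"): 10.0}
flow = {("a", "c"): {("a", "c"): 6.0, ("a", "b"): 4.0, ("b", "c"): 4.0}}
capacity = {("a", "c"): 6.0, ("a", "b"): 5.0, ("b", "c"): 5.0}
print(satisfies_flow_conservation(flow, demand, ["a", "b", "c"]))  # True
print(satisfies_capacity(flow, capacity))                          # True
```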


The  full  restorability  constraints  differ  according  to  the  failure  events  considered  in  the  network 
design  as  well  as  the  traffic  rerouting  strategy  adopted  in  the  self-healing  network.  From  the  sur¬ 
vivability  viewpoint,  a  network  should  be  designed  to  fully  recover  from  as  many  failure  patterns 
as  possible.  However,  this  will  increase  not  only  the  network  installation  cost  but  also  the  com¬ 
plexity  of  the  optimization  procedure.  As  discussed  in  Section  2.2.3,  a  complete  span  cut  is  consid¬ 
ered  in  our  formulation  since  it  is  known  to  be  the  most  common  and  frequently  reported  failure 
event  over  the  decades  [82]. 

The full restorability constraints further differ according to the traffic rerouting strategies. Three
strategies identified in Section 2.1.2 are considered in the SCFA problem: (1) MF rerouting with
line restoration, (2) MF rerouting with end-to-end restoration and (3) KSP rerouting with line
restoration. In the following, the corresponding SCFA problems are referred to as the SCFA-MF-LINE,
SCFA-MF-ETE and SCFA-KSP-LINE problems, respectively. As will be discussed in Section 4.4,
the  KSP-based  formulation  suffers  from  non-convex  constraints.  The  MF-based  formulation,  on 
the  other  hand,  is  immune  to  this  problem,  and  the  constraints  can  be  written  as  a  set  of  linear 
equations  by  introducing  restoration  flow  variables.  In  the  following,  we  focus  on  the  MF-based 
systems,  and  their  globally  optimal  solutions  are  obtained  through  the  mathematical  programming 
techniques.  We  develop  a  heuristic  approach  for  the  SCFA-KSP-LINE  problem.  The  initial  solu¬ 
tion  of  the  SCFA-KSP-LINE  is  obtained  from  the  optimal  solution  of  the  SCFA-MF-LINE  prob¬ 
lem. 
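
For intuition about the MF-based restorability condition with line restoration, the sketch below tests one failed arc: the flow lost on arc u->v must not exceed the maximum flow from u to v over the spare capacity of the surviving arcs. It is a sketch under stated assumptions, not the formulation of Appendix A: the function names are hypothetical, Edmonds-Karp is used simply as a convenient maximum-flow routine, and each direction of the failed link is tested separately, which ignores any contention between the two directions for the same spare capacity.

```python
from collections import deque
from typing import Dict, Tuple

Arc = Tuple[int, int]


def max_flow(capacity: Dict[Arc, float], source: int, sink: int) -> float:
    """Edmonds-Karp maximum flow on a directed graph given as arc capacities."""
    if source == sink:
        raise ValueError("source and sink must differ")
    residual: Dict[Arc, float] = {}
    neighbors: Dict[int, set] = {}
    for (u, v), cap in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0.0) + cap
        residual.setdefault((v, u), 0.0)   # reverse arc for flow cancellation
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and residual[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Trace the path back from the sink, then augment by its bottleneck.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[arc] for arc in path)
        for (a, b) in path:
            residual[(a, b)] -= bottleneck
            residual[(b, a)] += bottleneck
        total += bottleneck


def arc_is_restorable(spare: Dict[Arc, float], lost_flow: float, failed_arc: Arc) -> bool:
    """Line restoration test for one direction of a failed link."""
    u, v = failed_arc
    # A span cut removes both arcs of the failed link.
    surviving = {a: s for a, s in spare.items() if set(a) != {u, v}}
    return max_flow(surviving, u, v) >= lost_flow - 1e-9


# Hypothetical spare capacities (BU) on a triangle; arc (1, 2) fails carrying 100 BU.
spare = {(1, 3): 100.0, (3, 2): 60.0, (1, 2): 50.0, (2, 1): 50.0}
print(arc_is_restorable(spare, 100.0, (1, 2)))  # False: only 60 BU can be rerouted via node 3
print(arc_is_restorable(spare, 60.0, (1, 2)))   # True
```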

3.2.  SCFA-MF-LINE 

3.2.1.  Problem  formulation 

The  intricate  math  is  shown  in  Appendix  A  in  order  to  preserve  the  flow  of  the  document. 

3.2.2.  Solution  approach 

The  intricate  math  is  shown  in  Appendix  B  in  order  to  preserve  the  flow  of  the  document. 
3.2.3. Validity of the algorithm


The  intricate  math  is  shown  in  Appendix  C  in  order  to  preserve  the  flow  of  the  document. 

3.2.4.  Extension  to  SCFA-KSP-LINE 

The  intricate  math  is  shown  in  Appendix  D  in  order  to  preserve  the  flow  of  the  document. 

3.3.  SCFA-MF-ETE 

3.3.1.  Problem  formulation 

The  intricate  math  is  shown  in  Appendix  E  in  order  to  preserve  the  flow  of  the  document. 

3.3.2.  Solution  approach 

The  intricate  math  is  shown  in  Appendix  F  in  order  to  preserve  the  flow  of  the  document. 

3.3.3.  Validity  of  the  algorithm 

The  intricate  math  is  shown  in  Appendix  G  in  order  to  preserve  the  flow  of  the  document. 

3.4.  Evaluations 

3.4.1. Major advantages of the ETE and JOA schemes

The  required  resource  installation  cost  is  expected  to  decrease  with  joint  optimization  as  well  as 
end-to-end  restoration.  In  order  to  understand  the  causes  of  the  economical  benefit  due  to  these 
schemes,  we  first  analyze  the  results  based  on  the  small  sample  network  shown  in  Figure  3-1.  The 
two  optimal  assignment  approaches  are  employed  for  each  restoration  scheme:  Jointly  Optimal 
Assignment  (JOA)  and  Optimal  Capacity  Assignment  (OCA).  The  JOA  jointly  optimizes  the 
capacity  and  flow  assignment  using  the  techniques  proposed  in  the  preceding  two  sections,  while 
the  OCA  optimizes  the  capacity  placement  with  a  predetermined  commodity  flow  assignment.  The 
commodity flow is fixed to be the shortest route flow^2 in our experiment. Since the shortest route
flow requires the least amount of capacity to satisfy the traffic demand, this flow assignment must
be the best choice if a flow must be predetermined^3. Thus, a comparative study between the JOA
and OCA schemes clarifies the benefit of the joint optimization. The capacity and flow assignments
of the following four cases are, therefore, studied in this experiment: 1) end-to-end restoration
with JOA (ETE-JOA), 2) end-to-end restoration with OCA (ETE-OCA), 3) line restoration with
JOA (LINE-JOA), and 4) line restoration with OCA (LINE-OCA). The traffic demand used in the
experiment as well as the resulting commodity flow assignment are shown in Table 3-1. Table 3-2
summarizes the optimal capacity placement for each scheme. A restoration flow assignment is
illustrated for two failure scenarios in Figures 3-2 and 3-3. Three major causes are identified for
the superiority of the end-to-end restoration scheme.

2. A shortest route flow is a multicommodity flow where each commodity is routed over its shortest path under a certain
arc cost [22]. In this case, an arc cost is defined to be a unit capacity cost over an arc.

Cause 1. With the line restoration scheme, a VP may suffer backhauling [5] after restoration.
Moreover, it could even contain a loop. On the other hand, end-to-end restoration is
immune to these problems, and thus it can avoid unnecessary consumption of spare
bandwidth.

Consider the case of the OCA approach where the flow assignments for the two restoration
schemes are identical. Upon a failure of link 2, the <5,1> commodity^4 is restored over the path
5-3-2-1^5 with end-to-end restoration (Figure 3-2-b). Line restoration recovers the <5,1> commodity
over 5-4-5-3-2-1. Thus a flow is assigned to traverse back and forth between nodes 5 and
4 (Figure 3-2-a). This phenomenon is called backhauling [5], and the spare bandwidth over 5-4-

Figure 3-1. A small sample network


3.  This  hypothesis  is  further  confirmed  by  the  following  result  with  the  JOA-based  assignment:  At  the  optimum,  virtu¬ 
ally  all  commodities  are  served  via  a  single  route  to  carry  their  traffic  demand,  and  only  a  few  use  multiple  routes. 
Although  such  a  route  is  not  necessarily  the  shortest  one,  most  flows  are  routed  over  the  least  expensive  path.  This  sug¬ 
gests  that  the  shortest  path  is  the  best  choice  for  most  commodities,  and  thus  a  shortest  route  flow  is  an  obvious  choice  if 
a  flow  must  be  predetermined. 

4.  A  commodity  from  a  source  node  a  to  a  destination  node  b  is  referred  to  as  commodity  <a,b>. 

5.  a-b-c  denotes  a  path  from  nodes  a  to  c  through  b. 


Table  3-1.  Demand  and  commodity  flow  assignment  in  the  small  sample  network 


[Table body not recoverable from the scanned source. Columns: Src., Dest., Demand, ETE JOA, LINE JOA, OCA, Route.]

The  demand  and  commodity  flow  assignment  is  given  in  terms  of  BU  (bandwidth  unit).  The  BU  is 
an  arbitrary  unit  of  bandwidth,  say  Mbps. 


Table  3-2.  Optimal  arc  capacity  assignment  in  the  small  sample  network 


[Table body not recoverable from the scanned source. Row labels legible in the scan: Total, Flow, Spare; see notes a to e below for their meaning.]

a.  Total  installation  cost  (CU).  CU  (cost  unit)  is  an  arbitrary  unit  of  arc  capacity  installation  cost. 

b.  A  unit  arc  cost  (CU/BU). 

c.  Total  capacity  of  an  arc  (BU). 

d.  Aggregate  arc  flow.  (BU) 

e.  Spare  capacity  of  an  arc.  (BU) 
5 is wasted. In our experiment with a larger network, line restoration could generate a loop in the
middle of a rerouted VP. This looping further deteriorates the performance of line restoration,
although it is a relatively rare event compared to backhauling.

Cause 2. End-to-end restoration can geographically distribute the effect of failure over the
network and lead to a higher level of sharing in spare capacity among failure events.

The more spare bandwidth is shared for restoration from different failure scenarios, the less
spare capacity is needed in total. End-to-end restoration could attain a higher Degree of Sharing
in Spare Capacity (DSSC) than line restoration since alternate paths could be dispersed over a
geographically wider area. On the other hand, all alternate paths for the line restoration scheme
must terminate at the end-nodes of a failed link. Thus, the required spare capacity would be more
localized for each failure, and redundant bandwidth could be shared less among different link
failure scenarios. For example, consider again a failure of link 2 and the restoration due to the
LINE-OCA as well as the ETE-OCA (Figure 3-2). In order to recover a 400 BU⁶ flow on arc 4-1,
the line restoration scheme requires 400 BU of spare capacity in total over the arcs leaving
node 4. On the other hand, the end-to-end restoration scheme uses 100 BU less spare capacity in
total since the affected path 5-4-1 is rerouted over a path away from node 4 (5-3-2-1). This extra
bandwidth prepared for line restoration turns out to be unnecessary for other failure scenarios,
implying less efficient usage of the redundant capacity.


Figure 3-2. Restoration flow of arc 4-1 upon a failure of link 2

Legend: (arc spare capacity); <commodity flow>; restoration flow.
All numbers in the figure are given in terms of BU (bandwidth unit).


6. BU (bandwidth unit) is an arbitrary unit of bandwidth, say Mbps.

Cause  3.  The  end-to-end  restoration  scheme,  especially  when  combined  with  JOA,  can  share 
spare  capacity  more  wisely  by  reclaiming  the  working  capacity  of  affected  VP’s  for 
restoration  purposes. 

Generally speaking, the links adjacent to a vertex with node degree 2 require relatively
large amounts of spare capacity for line restoration. Upon a failure of either link, all flow on the
link must be restored over the other link. Since there is no other alternative, increasing a flow on
such links requires twice as much capacity installation: once for its primary use and once for the
restoration. Therefore, the LINE-JOA scheme usually avoids routing a flow via such a 'trap' node.
In our example, no flow is routed through node 1, and only demand originating or terminating at
the node is assigned over its adjacent links (Table 3-2). These links still require twice as much
bandwidth as the total demand coming into and going out of the node. A large part of their spare
capacity is typically prepared for the recovery from a failure of the adjacent link. Thus, spare
capacity would be inefficiently allocated on the links adjacent to a trap node with line
restoration.
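
The factor of two can be made explicit with a short worked inequality. The symbols $f_1$, $f_2$ (working flows on the two links adjacent to a degree-2 node) and $s_1$, $s_2$ (spare capacities on those links) are introduced here only for illustration and do not appear in the original formulation; the sketch considers one direction of flow and only the failures of these two links.

```latex
% Line restoration around a degree-2 ("trap") node with adjacent links 1 and 2.
% A failure of link 1 forces all of f_1 onto link 2, and vice versa:
\begin{align*}
  s_2 &\ge f_1, & s_1 &\ge f_2, \\
  (f_1 + s_1) + (f_2 + s_2) &\ge 2\,(f_1 + f_2).
\end{align*}
% Hence the total capacity on the two links is at least twice the flow they carry.
```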

By routing more flow over a trap node, the end-to-end restoration scheme could reduce the
total capacity installation cost. Since the ETE scheme can reclaim the working capacity of
affected virtual paths for restoration purposes, link capacity can be used for an original flow
assignment as well as for restoration. This capacity sharing is especially useful at the links where
a large amount of spare capacity is required, such as the links around a trap node. Let us explain
the case with the example of the small network. With the ETE-JOA scheme, 30 percent of the
<2,4> demand is conveyed over the path 2-1-4 without increasing the total capacity over links 1
and 2 (Tables 3-2 and 3-3). Upon a failure of link 1, this traffic is rerouted over 2-5-4, and its
working capacity of 300 BU over arc 1-4 is released (Figure 3-3). A flow of 400 BU that previously
originated from node 1 towards node 2 is now restored over arc 1-4 by claiming the capacity
which was occupied by the <2,4> demand. Thus, the required spare capacity on arc 1-4 is only
100 BU although a flow of 400 BU must be restored upon the failure. As a result of this routing
and restoration strategy, we can reduce the working capacity of commodity <2,4> over arc 2-4
by 300 BU.

Reclaiming of working capacity does not necessarily happen around a trap node. In the above
example, the <2,4> commodity over path 2-1-4 is switched to 2-5-4 upon a failure of link 1.
However, only 200 BU of spare capacity is allocated over arc 2-5 while the rerouted bandwidth
amounts to 300 BU. The additional 100 BU for the restoration comes from the working capacity
which was occupied by the <1,5> demand over 1-2-5 and released upon the failure. The reclaiming
effect is further observed in the <4,3> commodity, where a third of its demand is routed over
a longer path. In summary, the ETE-JOA approach can realize economical capacity allocation
by using part of a link capacity for both an original flow assignment and a restoration from a
failure scenario.
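
The arithmetic behind the two reclaiming cases above can be checked with a minimal sketch. The helper name `required_spare` is hypothetical, and the function looks at a single arc under a single failure scenario; in the actual optimization the spare capacity of an arc must cover the worst case over all failure scenarios.

```python
def required_spare(restoration_flow, reclaimed_working):
    """Spare capacity (BU) an arc needs for one failure scenario when part of
    the restoration flow can reuse working capacity released by rerouted paths."""
    return max(0, restoration_flow - reclaimed_working)


# Arc 1-4, failure of link 1: 400 BU must be restored, 300 BU is released
# by the <2,4> traffic that moved from 2-1-4 to 2-5-4.
spare_arc_1_4 = required_spare(400, 300)

# Arc 2-5, failure of link 1: 300 BU is rerouted, 100 BU is released
# by the <1,5> demand that used 1-2-5.
spare_arc_2_5 = required_spare(300, 100)

print(spare_arc_1_4, spare_arc_2_5)
```

Both values agree with the figures quoted in the text (100 BU on arc 1-4 and 200 BU on arc 2-5).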

The  JOA  can  save  resource  installation  cost  over  the  OCA  for  the  following  reason: 

Cause  4.  The  JOA  scheme  can  adjust  a  flow  so  that  the  spare  capacity  can  be  efficiently  shared 
to  reduce  the  total  capacity  installation  cost. 


Other  than  those  shown  in  the  figure, 


The  ETE-JOA  scheme  is  employed  in  this  example.  All  numbers  in 
the  figure  are  given  in  terms  of  BU  (bandwidth  unit). 
It is worthwhile to note that the minimum working capacity cost⁷ is attained by the flow
assignment due to the OCA since the shortest route flow is used. The JOA requires at least equal
and typically more working capacity cost. By sharing the spare capacity more efficiently, however,
the JOA could attain a reduction in the total capacity cost⁸. For example, compare the
results of the LINE-JOA and the LINE-OCA. The <1,5> demand is routed over 1-2-5 with the
JOA rather than its shortest path 1-4-5 (Table 3-2). Although this results in 50 CU⁹ more working
arc cost, the spare cost of arc 1-4 is reduced by 100 CU, leading to a net cost reduction of 50 CU
(Table 3-3). Due to the relatively large demand for <2,4>, more redundant bandwidth is installed
over 2-1-4 than 4-1-2. It is therefore beneficial to adjust the flow over links 1 and 2 so that more
spare capacity would be required over 2-1-4 than 4-1-2 upon a failure of the links. For this reason,
the JOA assigns the <1,5> commodity over 1-2-5 instead of 1-4-5.

3.4.2.  Effect  of  network  topology 

Further experiments have been carried out to analyze the effectiveness of the end-to-end restoration
and joint optimization in a wide variety of network environments. Eight sample networks with
diverse topological characteristics have been explored in the experiment. Four of them are real
networks which have appeared in the literature [24] [35] [80] [83] (Figure 3-4). The other four are
artificial networks (Figure 3-5). In the experiments, two commodities are assumed to be defined
between any node pair, one for each direction. The following projected traffic demand patterns
have been employed: a uniform (UF) demand pattern with 1,000 BU between any node pair, and a
weighted (WT) demand pattern with 1,000 BU between any adjacent nodes and 500 BU for the
others¹⁰. Furthermore, random (RD) demand patterns, where the demand of each commodity is
uniformly distributed between 250 BU and 1,750 BU, are considered. The JOA and OCA are
investigated for each restoration scheme.
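
The three demand patterns described above can be sketched as follows. This is an illustration only: the function name `demand_matrix`, its arguments, and the use of Python's `random` module are assumptions, and the text does not say how the random demands were actually drawn beyond the uniform range.

```python
import random


def demand_matrix(nodes, links, pattern, rng=None):
    """Return {(src, dst): demand in BU} with one commodity per direction.

    pattern is "UF" (uniform), "WT" (weighted) or "RD" (random).
    links is a list of undirected node pairs; it is only used by "WT".
    """
    rng = rng or random.Random()
    adjacent = {frozenset(link) for link in links}
    demands = {}
    for src in nodes:
        for dst in nodes:
            if src == dst:
                continue
            if pattern == "UF":
                demand = 1000
            elif pattern == "WT":
                # Adjacent node pairs are assumed to exchange more traffic.
                demand = 1000 if frozenset((src, dst)) in adjacent else 500
            elif pattern == "RD":
                demand = rng.uniform(250, 1750)
            else:
                raise ValueError(f"unknown demand pattern: {pattern}")
            demands[(src, dst)] = demand
    return demands


# Hypothetical 4-node ring used only to exercise the function.
ring_nodes = [1, 2, 3, 4]
ring_links = [(1, 2), (2, 3), (3, 4), (4, 1)]
weighted = demand_matrix(ring_nodes, ring_links, "WT")
```

In the ring above, `weighted[(1, 2)]` is 1000 because nodes 1 and 2 are adjacent, while `weighted[(1, 3)]` is 500.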

7. The working capacity cost is defined to be the cost of the capacity required to satisfy the traffic demand. This cost is
determined by a commodity flow assignment.

8. The total capacity cost for a fully restorable network is given as the sum of the working capacity cost and the cost of
the spare capacity.

9. CU (cost unit) is an arbitrary unit of arc capacity installation cost. In this example, an arc installation cost per one BU
is given in terms of CU (refer to Table 3-3).

10. A weighted demand pattern comes from the assumption that more traffic would be generated between adjacent
nodes than others.


Figure 3-5. Sample networks 2. Artificial networks

(e) (12,50) network: DA=2.72, DE=3.47
(f) (13,40) network: DA=1.9, DE=2.65

The dotted lines represent the axes of symmetry.
DA = The average number of disjoint arc restoration paths
DE = The average number of disjoint end-to-end paths


The result is expressed in terms of a normalized spare capacity cost (NC), which is defined as
follows. Let $D^*$ be the minimum link cost required to support the traffic demand. Obviously, the
minimum is attained when all demands are sent over their shortest routes. Let $D_T$ be the total link
cost obtained using a fully restorable capacity placement scheme $T$. Now, define the required spare
capacity cost of $T$ as the additional cost necessary to realize full restorability with the scheme,
which is given by $(D_T - D^*)$. The normalized spare capacity cost (NC) is obtained by
normalizing this cost with respect to $D^*$, which gives $100 \cdot (D_T - D^*) / D^*$ %.
Note that the additional cost is not necessarily equal to the cost of the actual spare bandwidth
because a flow assignment may be different from a shortest route flow in the case of joint
optimization. Figure 3-6 summarizes the required spare capacity cost for each optimization scheme.
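
As a small illustration of the normalization, the sketch below evaluates NC for made-up costs. The function name and the numbers are hypothetical and are not taken from the experiments.

```python
def normalized_spare_cost(total_cost, min_cost):
    """NC in percent: 100 * (D_T - D*) / D*.

    total_cost is D_T, the link cost of a fully restorable placement.
    min_cost is D*, the shortest-route cost needed to carry the demand.
    """
    if min_cost <= 0:
        raise ValueError("min_cost (D*) must be positive")
    return 100.0 * (total_cost - min_cost) / min_cost


# Hypothetical costs in CU: D* = 2000 and D_T = 3000.
example_nc = normalized_spare_cost(3000, 2000)
print(example_nc)
```

With these illustrative values the scheme needs 50 percent more link cost than the bare shortest-route assignment.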

First  of  all,  the  demand  pattern  does  not  have  a  significant  influence  on  the  effectiveness  of  each 
optimization  scheme  (Figure  3-6-a~d).  As  expected,  the  JOA  requires  less  spare  capacity  than  the 
OCA,  and  the  end-to-end  restoration  outperforms  the  line  restoration  in  terms  of  spare  capacity 
cost.  The  improvement  ratio  appears  to  be  almost  consistent  among  different  demand  patterns, 
although  the  required  spare  capacity  cost  differs  accordingly.  On  the  other  hand,  network  topology 
turns  out  to  have  a  significant  impact  on  the  required  spare  capacity  cost.  In  particular,  the  degree 
of  improvement  due  to  end-to-end  restoration  over  line  restoration  depends  considerably  on  the 
network  model  (Figure  3-6-e),  while  the  benefit  due  to  joint  optimization  (JOA  over  OCA)  turns 
out  to  be  less  sensitive  to  the  network  topology  (Figure  3-6-a). 

Then what kind of network topology obtains a more economical benefit by using an end-to-end
restoration scheme? Before answering this question, we first consider the three major advantages
of the end-to-end restoration scheme discussed in Section 3.4.1 and examine their effect on the
total redundant capacity cost. This analysis is used for further discussion about the effect of
network topology later. The eight sample networks with the UF and WT traffic demands are
employed in the experiments.

Figure 3-6. Required spare capacity cost for each scheme

Panels: (a) UF demand; (b) WT demand; (c) NJ-LATA - UF + WT + 8 RD's; (d) US-WAN - UF + WT + 8 RD's;
(e) UF demand. Horizontal axis: optimization scheme (ETE-JOA, ETE-OCA, LINE-JOA, LINE-OCA).
Vertical axis: required spare capacity cost (NC).

Required spare capacity cost is given in terms of a normalized spare capacity cost (NC).
Figures (a) and (e) show the same result, but the order of the optimization schemes along the x-axis
differs. The former is intended to emphasize the difference between the JOA and OCA schemes, while
the latter contrasts the ETE and LINE schemes.


First of all, we investigate the effect of geographical distribution of alternate routes on the
required spare capacity cost. As discussed in the previous section, the ETE scheme is expected to
share more spare capacity among different failure scenarios since the restoration paths could be
widely distributed (Cause 2). Thus, the scheme could reduce the spare capacity installation. In the
experiment, the degree of sharing in spare capacity (DSSC) is measured by the average spare
capacity usage per failure, $W$, which is given by:

$$W = \sum_{l \in E} W_l / |E|$$

Here $W_l$ is computed over the arcs $a \in A_l$, where $A_l \equiv A \setminus \{a : a \in l\}$. Note
that $r^l_a$ is zero with line restoration. $W_l$ gives the fraction of the total spare capacity
(weighted by arc cost) which is actually consumed upon a failure of link $l$. $W$ is the average of
$W_l$ over all failure scenarios.

Figure 3-7 depicts the relationship between $W$ and the normalized spare capacity cost for the
ETE and LINE schemes. The line in the figure is the least-square line over all sample points. The
result indicates that the required spare capacity cost is largely determined by the attainable DSSC
for end-to-end restoration. Such a relationship also holds for the line restoration, but it has a few
remarkable exceptions. The (24,60) network requires considerably more spare bandwidth compared
to its attainable DSSC, while the NJ-LATA network needs relatively less spare bandwidth.

Backhauling and looping are identified as other factors that cause the inefficient use of spare
bandwidth in the line restoration scheme (Cause 1 in the previous section). We next investigate the
effect of backhauling and looping on the required spare bandwidth for the line restoration-based
networks. The effect is measured by the average wasted spare bandwidth cost per failure due to
backhauling and looping, $BL$. It is defined by:

$$BL = \sum_{l \in E} BL_l / |E|$$

Here $b^l_a$ is the wasted bandwidth of arc $a$ due to backhauling or looping upon line restoration
from a failure of link $l$. $BL_l$ is the fraction of the wasted spare bandwidth (weighted by arc cost)
over the bandwidth actually used for the restoration of link $l$. $BL$ is the average over all
single-link failures.
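
The per-failure averaging used for $BL$ (and, in the same form, for $W$ and $RC$) can be sketched as below. This is an illustration under stated assumptions: the exact expression for $BL_l$ is taken from its verbal description as a cost-weighted ratio, and the names `cost_weighted_fraction`, `average_over_failures`, the arc labels, and all numbers are hypothetical.

```python
def cost_weighted_fraction(arc_cost, part_bw, whole_bw):
    """Ratio of two bandwidth totals for one failed link, each weighted by arc cost.

    arc_cost: {arc: unit cost in CU/BU}
    part_bw:  {arc: bandwidth in BU counted in the numerator (e.g. wasted)}
    whole_bw: {arc: bandwidth in BU counted in the denominator (e.g. used)}
    """
    part = sum(arc_cost[a] * bw for a, bw in part_bw.items())
    whole = sum(arc_cost[a] * bw for a, bw in whole_bw.items())
    return part / whole if whole else 0.0


def average_over_failures(per_link_value):
    """Average a per-link quantity over all single-link failure scenarios."""
    return sum(per_link_value.values()) / len(per_link_value)


# Hypothetical two-link example.
arc_cost = {"4-5": 1.0, "5-3": 2.0}
bl_per_link = {
    # Failure of link "l1": 100 BU of the restoration bandwidth on arc 4-5 is wasted.
    "l1": cost_weighted_fraction(arc_cost, {"4-5": 100}, {"4-5": 200, "5-3": 100}),
    # Failure of link "l2": no backhauling or looping occurs.
    "l2": cost_weighted_fraction(arc_cost, {}, {"5-3": 50}),
}
bl_average = average_over_failures(bl_per_link)
print(bl_per_link, bl_average)
```

In this toy case the wasted share is 0.25 for the first failure and 0 for the second, so the average is 0.125.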


11. The amount of wasted bandwidth depends on how the virtual paths are restored over the restoration paths. In the
example of Figure 3-2-(a), a virtual path 5-4-1 results in backhauling if it is restored over 5-4-5-3-2-1, while the path
does not suffer from backhauling if it is rerouted over 5-4-2-1. The output of the SCFA-MF-LINE problem, however,
does not specify which virtual paths are rerouted over which restoration paths. In the calculation of $b^l_a$, it is assumed
that the bandwidth of the affected traffic is uniformly distributed over the restoration paths.
Figure 3-7. The effect of attainable DSSC on required spare capacity cost

Two  demand  patterns,  UF  and  WT,  are  utilized  for  each  sample  network.  The  spare  capacity  cost  is 
expressed  in  terms  of  a  normalized  spare  capacity  cost  (NC).  The  line  in  the  figures  is  the  least-square 
line  over  all  samples.  The  closer  all  sample  points  are  to  the  line,  the  closer  the  relationship  observed 
between  required  spare  capacity  cost  and  attainable  DSSC. 
Figure 3-8 illustrates the effect of backhauling and looping on the required spare capacity cost.
The line in the figure is the least-square line over all sample points. The result shows that the
degree of backhauling and looping varies somewhat by network topology, and it has a very loose
relationship to the required spare capacity cost. The wasted bandwidth (BL) is nominal in the
NJ-LATA network, which accounts for the lower spare capacity requirement compared to its
attainable DSSC in Figure 3-7-(c,d). On the other hand, the (24,60) network ends up with the most
waste, requiring a relatively higher spare capacity installation as in Figure 3-7-(c,d).


Figure 3-8. The effect of backhauling and looping on required spare capacity cost

Two demand patterns, UF and WT, are employed for each sample network. The spare capacity cost is
expressed in terms of a normalized spare capacity cost (NC). The line in the figures is the least-square line
over all samples.


The end-to-end restoration scheme can further reduce the spare capacity cost by reclaiming the
bandwidth of affected VP's for restoration purposes (Cause 3 in the previous section). The average
reclaimed spare bandwidth cost per failure, $RC$, is explored to clarify its relationship to the
required spare capacity cost. $RC$ is given by:

$$RC = \sum_{l \in E} RC_l / |E|$$

Here $RC_l$ is computed over the arcs $a \in A_l$ from $r'^{l}_{a}$, the total amount of reclaimed
bandwidth of arc $a$ upon a failure of link $l$. The bandwidth released and reclaimed by the same
affected VP is excluded in this calculation¹². Thus, $r'^{l}_{a}$ is not equal to $r^{l}_{a}$. $RC_l$ is
the percentage of the reclaimed bandwidth (weighted by arc cost) over the total bandwidth spent for
the restoration from a failure of link $l$, and $RC$ is its average over all failure scenarios.

Figure 3-9 shows the result of this experiment. The required spare capacity cost is plotted
against $RC$. The result indicates that bandwidth reclaiming has no clear relationship to the
required spare capacity cost. This suggests that it is not a dominant factor in the effectiveness
of the end-to-end restoration scheme.
Figure 3-9. The effect of bandwidth reclaiming on required spare capacity cost

Two  demand  patterns,  UF  and  WT,  are  employed  for  each  sample  network.  The  spare  capacity  cost  is 
expressed  in  terms  of  a  normalized  spare  capacity  cost  (NC).  The  line  in  the  figures  is  the  least-square 
line  over  all  samples. 


12. For example, suppose that a virtual path 5-4-1 in Figure 3-1 is rerouted over a restoration path 5-4-2-1 upon a failure
of link 2. Upon restoration, the bandwidth over arc 5-4 is released and reclaimed by this virtual path. Since this type of
bandwidth reclaiming by the ETE scheme does not give any advantage over the LINE scheme, the amount of the
bandwidth reclaimed in this manner is excluded from the calculation of $r'^{l}_{a}$.
In summary, the degree of sharing in spare capacity (DSSC) turns out to be a major determining
factor of spare capacity cost, while the backhauling and looping effect also contributes to its
determination for networks based on the line restoration scheme.

Next, we examine how network topology influences these factors and determines the required
spare capacity cost. A network can be characterized by two topological aspects: connectivity
and connection regularity. We find that these two measures can reflect the effect of the two key
factors in determining the redundant capacity cost: the DSSC and the backhauling. Connectivity
expresses how well nodes are connected in a network. It is expected that a well-connected network
requires less spare capacity than a sparse network, since the former would have more candidate
restoration paths than the latter. In the following discussion, the average number of disjoint arc
restoration paths, $DA$, is employed as the measure of the connectivity for the networks based on the
line restoration scheme¹³. It is defined by:

$$DA = \sum_{a \in A} DA_a / |A|$$

where $DA_a$ is the number of disjoint (i.e. mutually exclusive) restoration paths for arc $a$. The more
disjoint restoration paths, the higher the DSSC that can be attained, since the effect of failure can
be distributed over more restoration paths. As for the network based on the end-to-end restoration
scheme, the average number of disjoint end-to-end paths, $DE$, is employed as the measure of the
connectivity. It is defined by:

$$DE = \sum_{k \in \Pi} DE_k / |\Pi|$$

where $DE_k$ is the number of disjoint paths between the source and destination nodes of
commodity $k$.
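
One way to compute the two connectivity measures is sketched below. It is a minimal illustration, not the procedure used in the study: "disjoint" is assumed here to mean link-disjoint, the count is obtained with unit-capacity augmenting paths, and the function names and the two test topologies (a 4-node ring and a 4-node full mesh) are hypothetical.

```python
from collections import deque


def count_disjoint_paths(links, src, dst):
    """Number of link-disjoint paths between src and dst in an undirected graph."""
    cap, adj = {}, {}
    for u, v in links:
        # Each undirected link contributes one unit-capacity arc per direction.
        for x, y in ((u, v), (v, u)):
            cap[(x, y)] = cap.get((x, y), 0) + 1
            adj.setdefault(x, set()).add(y)
    count = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            x = queue.popleft()
            for y in adj.get(x, ()):
                if y not in parent and cap[(x, y)] > 0:
                    parent[y] = x
                    queue.append(y)
        if dst not in parent:
            return count
        # Push one unit of flow back along the path that was found.
        y = dst
        while parent[y] is not None:
            x = parent[y]
            cap[(x, y)] -= 1
            cap[(y, x)] += 1
            y = x
        count += 1


def average_da(links):
    """DA: mean number of disjoint restoration paths between the end-nodes of a failed link."""
    total = 0
    for i, (u, v) in enumerate(links):
        surviving = links[:i] + links[i + 1:]
        total += count_disjoint_paths(surviving, u, v)
    return total / len(links)


def average_de(links, nodes):
    """DE: mean number of disjoint end-to-end paths over all ordered node pairs."""
    pairs = [(s, t) for s in nodes for t in nodes if s != t]
    return sum(count_disjoint_paths(links, s, t) for s, t in pairs) / len(pairs)


ring = [(1, 2), (2, 3), (3, 4), (4, 1)]
mesh = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(average_da(ring), average_de(ring, [1, 2, 3, 4]))
print(average_da(mesh), average_de(mesh, [1, 2, 3, 4]))
```

Because both arcs of a link share the same end-nodes, averaging over links as done here gives the same value as averaging over arcs. In the ring, each failed link has a single restoration path while every node pair has two disjoint end-to-end paths, which mirrors the observation that $DE$ exceeds $DA$ in the sample networks of Figure 3-5.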

13. An average node degree can also be used for the connectivity measure. A degree of node v is defined to be the number
of links incident with v. An average node degree is its average over all nodes in a network. However, we find that DA
is a more suitable metric for the network connectivity than the average node degree because the former represents a
global connectivity between the end-nodes of a failed link, while the latter only captures a local view at a node.


Connection regularity is another important topological measure. The measure should express
how evenly the links are connected over a network and how symmetrically the connection is
arranged in a network. If a connection is well-balanced, it is expected that the affected VP's can be
distributed over the network more evenly, which could promote attainable DSSC among different
failure scenarios. A major obstacle here is that the measure is very hard to quantify. In the following
discussion, we classify the eight sample networks into two categories: balanced networks and
unbalanced networks. A good measure for the connection regularity needs to be considered further
in future work. The eight sample networks can be categorized as follows:

1.  Balanced  networks:  (12,50),  (16,56),  (13,40)  and  (24,60)  networks  in  decreasing  order  of 
network  connectivity. 

2. Unbalanced networks: NJ-LATA, LATA, US-WAN, and ARPA networks in decreasing order
of network connectivity.

First consider the case of end-to-end restoration. Figure 3-10-a,b depicts the relationship
between connectivity ($DE$) and the required spare capacity cost. The solid line in the figure is the
least-square line over all sample points, while the dashed line is the least-square line over all
samples of the balanced networks, and the dotted line is that over all samples of the unbalanced
networks. The figure shows that all balanced sample networks require less spare capacity than
any unbalanced model network¹⁴. For example, the (24,60) network requires less spare bandwidth
than the LATA network, although the former has a lower level of network connectivity than
the latter. This indicates that connection regularity is a major factor in determining the required
spare capacity.

Among the balanced networks, well-connected networks generally require less spare capacity
cost. Since a network with a higher connectivity has more disjoint restoration paths, it can
distribute the effect of a failure over a wider area, resulting in a higher DSSC. The only exception is the
(13,40) network, which requires a slightly higher amount of redundant capacity than the (24,60)
network in spite of its higher connectivity. This is because the (13,40) network is less symmetrical
than the (24,60) network¹⁵. This suggests that regularity has a significantly larger impact on the
required capacity than connectivity. The same argument on the effect of connectivity can apply to
the unbalanced networks, but it holds only weakly. A possible cause is the difference in the
regularity level among those networks. In summary, connection regularity turns out to be an important
factor in deciding the redundant capacity cost of fully restorable networks with end-to-end
restoration, while connectivity plays a minor role in the determination of cost.


14.  As  shown  in  Figure  3-7-(a,b),  all  balanced  networks  can  attain  a  higher  DSSC  than  any  unbalanced  network. 

15. The (13,40) network is symmetric along only one axis, while the (24,60) network is symmetric along two axes (see
the dotted lines representing the axes of symmetry in Figure 3-5).
Figure 3-10. Topological effect 1. Effect of Connectivity

Panels: (c) MF-LINE/JOA; (d) MF-LINE/OCA. Vertical axis: required spare capacity cost (NC).

Two demand patterns, UF and WT, are employed for each sample network. The spare capacity
cost is expressed in terms of a normalized spare capacity cost (NC). The solid line in the figures is
the least-square line over all samples, while the dashed line is the least-square line over all samples
for the balanced networks, and the dotted line is that over all samples for the unbalanced networks.
Next, consider the case of line restoration (Figure 3-10-c,d). The figure shows that balanced
networks require less spare capacity than unbalanced networks if they have a similar level of
connectivity. For example, although the connectivity level ($DA$) is almost the same for the (13,40) and
US-WAN networks, the former requires less spare bandwidth due to its connection regularity.
Unlike the end-to-end restoration, however, connection regularity is not the only key factor:
connectivity also plays an important role in determining the necessary spare capacity cost. For
example, the (24,60) network requires a considerable amount of spare bandwidth due to its
sparseness even though it is well-balanced. Since rerouting is executed between the two end-nodes of an
affected link, all restoration flow must be sent over the unaffected arcs adjacent to the nodes.
However, the number of such arcs is very small for a sparse network. Consequently, a larger amount of
spare capacity must be installed over the arcs. Furthermore, backhauling is more likely to happen
in a network with a low connectivity. A VP relayed at a node adjacent to a failed link is
more likely to be rerouted back to the same link it traversed. This assertion is verified in
Figure 3-11. In summary, both connection regularity and connectivity are the key factors in determining the
necessary spare bandwidth for fully restorable networks based on line restoration.
Figure 3-11. Topological effect 2. Backhauling and connectivity


The  figure  shows  the  relationship  between  the  backhauling  and  looping  effect  (BL)  and  connectivity 
(DA).  Two  demand  patterns,  UF  and  WT,  are  employed  for  each  sample  network.  The  line  in  the  figures  is 
the  least-square  line  over  all  samples. 
Now we address the question posed earlier: what kind of network topology gains the most
economic benefit from the end-to-end restoration scheme? The cost savings obtained by end-to-end
restoration over line restoration are summarized in Table 3-3. The savings rate ranges widely
from 3 to 55 percent, suggesting that network topology is a crucial factor in the economic gain from
end-to-end restoration. In general, large cost savings are expected for balanced sparse networks, as
in the (24,60) network. This is because regularity gives end-to-end restoration a significant
advantage, and sparsity places a considerable penalty on line restoration. On the other hand, this
disadvantage of line restoration may fade away for well-connected networks. Furthermore,
end-to-end restoration may not be able to reap a large gain for an unbalanced network.
Consequently, the cost savings may be small for unbalanced and/or well-connected networks.¹⁶ This
accounts for the fact that only nominal savings can be obtained in the NJ-LATA (unbalanced and
well-connected) network, and small savings can be obtained for the ARPA (unbalanced) network.
In such networks, it may not be advisable to employ the end-to-end restoration scheme, considering
its complicated decision-making process for the alternate routes.
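The savings figures in Table 3-3 follow directly from the two cost rows. The short sketch below recomputes them from the tabulated ETE and LINE costs; the function and variable names are illustrative only, and small differences in the last digit are to be expected because the tabulated costs are already rounded.

```python
# Spare capacity costs (NC) as listed in Table 3-3, in the table's column order.
networks = ["NJ-LATA", "LATA", "US-WAN", "ARPA",
            "(12,50)", "(16,56)", "(13,40)", "(24,60)"]
cost_ete = [52.86, 43.93, 51.86, 78.69, 25.31, 30.76, 39.86, 37.80]
cost_line = [54.45, 51.60, 67.65, 83.71, 35.52, 39.07, 50.44, 83.07]


def savings(line_cost, ete_cost):
    """Return (absolute savings in NC, savings rate in percent)."""
    diff = line_cost - ete_cost
    return diff, 100.0 * diff / line_cost


# Map each network to its (savings, savings rate) pair.
table = {name: savings(line, ete)
         for name, line, ete in zip(networks, cost_line, cost_ete)}

for name, (diff, rate) in table.items():
    print(f"{name:8s} savings = {diff:6.2f} NC, rate = {rate:5.2f} %")
```

The balanced sparse (24,60) network stands out with a rate above 50 percent, while the NJ-LATA network stays near 3 percent, which matches the discussion above.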

Table 3-3. Cost savings of ETE-JOA over LINE-JOAᵃ

                      Unbalanced networks                 Balanced networks
                      Well-connected <-> Sparse           Well-connected <-> Sparse
Network model         NJ-LATA     LATA   US-WAN     ARPA  (12,50)  (16,56)  (13,40)  (24,60)
Cost (ETE)ᵇ             52.86    43.93    51.86    78.69    25.31    30.76    39.86    37.80
Cost (LINE)             54.45    51.60    67.65    83.71    35.52    39.07    50.44    83.07
Savings rateᶜ            2.93    14.87    23.35     6.00    28.74    21.27    20.98    54.50
Savingsᵈ                 1.59     7.67    15.79     5.02    10.21     8.31    10.58    45.27

a. UF demand pattern is used in the experiment.
b. (NC)
c. (LINE - ETE) / LINE (percent)
d. (LINE - ETE) (NC)


16. This is the contrapositive of the previous assertion that “large cost savings can be attained for balanced and sparse
networks”. The contrapositive of the proposition ‘p → q’ is given by ‘¬q → ¬p’. The contrapositive is true if the
original proposition is true. Now, the contrapositive of the above assertion can be stated as “if there is a network with
nominal cost savings, then it happens for unbalanced and/or well-connected networks”. The results of the NJ-LATA sample
network suggest the existence of such a network.

Finally, the cost savings with joint optimization are summarized in Table 3-4. Figure 3-12
compares the normalized spare capacity cost of the JOA and OCA schemes. As discussed before, the
comparative study with the OCA scheme elucidates the advantage of the joint optimization. The
result indicates that, on average, over 10 percent of the cost can be saved by the JOA over the OCA.
The cost savings ratio ranges from 6 to 22 percent. A less drastic dependency on network topology
is observed compared with the savings of the ETE over the LINE.

3.5.  Summary 

Two main results in this section are the development of joint optimization methods for the SCFA
problem and a comparative analysis of the restoration schemes with respect to required spare
capacity cost. No previous work has been reported on a joint optimization approach for
self-healing networks. Since the optimization is conducted jointly over capacity assignment,
commodity flow assignment and restoration flow assignment, however, the complexity of the problem
grows tremendously. In order to make the problem computationally tractable, several mechanisms
are developed. LU factorization of the basis matrix is facilitated by exploiting its special structure
with the low density of non-zero elements. A direct LU update technique is proposed to further
economize the computation. A row generation and deletion mechanism is employed to cope with
the explosive number of constraints in the SCFA-MF-ETE problem. The economic benefit of the
end-to-end restoration scheme is thoroughly examined with respect to the globally optimal
capacity installation cost. The degree of sharing in spare capacity (DSSC) appears to be a key
determining factor of spare capacity cost, while the backhauling and looping effects also contribute
to it for line restoration-based networks. The result suggests that a balanced sparse network can gain
a remarkable benefit from end-to-end restoration. Contrary to a widely held belief in the economic
advantage of the end-to-end restoration scheme, the analysis has revealed that the attainable gain
could be marginal for a well-connected and/or unbalanced network.


Table 3-4. Cost savings of JOA over OCA

      Network model     NJ-LATA     LATA     ARPA   US-WAN  (12,50)  (13,40)  (16,56)  (24,60)  Average
ETE   Cost (JOA)ᵃ         52.86    43.93    78.69    51.86    25.31    39.86    30.76    37.80
      Cost (OCA)          56.46    52.39    84.18    56.49    28.66    43.26    34.00    44.60
      Savings rateᵇ        6.39    16.16     6.53     8.19    11.69     7.87     9.54    15.25    10.20
LINE  Cost (JOA)          54.45    51.60    83.71    67.65    35.52    50.44    39.07    83.07
      Cost (OCA)          58.96    63.44    107.4    75.30    39.28    60.65    48.17    87.37
      Savings rate         7.65    18.66    22.04    10.15     9.57    16.85    18.89     4.92    13.59

a. Required spare capacity cost in terms of NC (normalized cost).
   Cost is normalized by the minimum average link capacity satisfying the demand.
b. (OCA - JOA) / OCA (percent)
The UF demand pattern is employed in the experiment.


Figure 3-12. Required spare capacity cost: JOA vs. OCA
IV Survivable Virtual Path Routing (SVPR)


The SVPR problem arises when there is a major change in the underlying network environment,
such as a change in traffic demand or a facility network reconfiguration. The VP planner performs
dynamic VP reconfiguration to maximize network survivability in the new environment. Three
restoration mechanisms discussed in Section 2.1.2 are considered: MF-LINE, MF-ETE and KSP-LINE. Due
to substantial differences in their restoration mechanisms, their optimization models and solution
procedures are developed independently. The SVPR problems for MF-LINE and MF-ETE networks
are modeled as large-scale LP problems, as in the SCFA problem, and similar strategies are used to
solve them. On the other hand, the SVPR problem for KSP-LINE networks is modeled
as a nonlinear, non-smooth multicommodity flow problem with linear constraints. New algorithms
are developed to find a near-optimal solution. The effectiveness of the proposed dynamic VP
routing scheme is examined through numerical experiments. A comparative analysis of the three
restoration schemes clarifies their pros and cons with respect to network restorability. This section
further demonstrates the efficiency of the two-step restoration scheme proposed in Section II.

4.0.  Introduction 

The SVPR minimizes the expected amount of lost flow upon restoration from a network failure.
It aims to enhance the survivability of the network in response to a change in the network
environment. Previous work on fast restoration schemes has overlooked the need for dynamic VP
reconfiguration and has attempted to meet a restorability objective through the development of
restoration protocols alone, assuming that the flow assignment is static. However, a static virtual path
routing approach not only degrades the performance of a fast restoration protocol, but also
requires more spare capacity to achieve the same level of survivability as a dynamically
reconfigurable network. Therefore, a dynamic VP reconfiguration scheme, combined with the optimal
network design scheme proposed in the previous section, can realize a self-healing ATM network
efficiently and in a cost-effective manner.
The rest of this section is organized as follows. Section 4.1 gives the formal definition of the
SVPR  problem.  The  succeeding  three  sections  discuss  the  details  of  the  SVPR  problems  based  on 
three  different  restoration  schemes.  The  problem  formulation  and  solution  approach  are  discussed 
in  the  respective  sections.  Finally,  results  from  numerical  experiments  are  reported  in  Section  4.5. 

4.1.  Formal  Problem  Definition 

Using the same notation¹ and assumptions introduced in Section 3.1, the SVPR problem can be
formally stated as the following optimization problem:


(SVPR)
Given       G = (V, A, c) and Q,
Minimize    an expected lost flow L(f)
over        f, r
subject to  a) f is a multicommodity flow satisfying Q (flow conservation law),
            b) capacity constraints are satisfied (capacity constraints), and
            c) restoration flow r obeys the fast restoration protocol
               (restoration flow constraints).


As in the SCFA problem, three fast restoration schemes lead to three distinct SVPR problems.
They are (1) MF rerouting with line restoration, (2) MF rerouting with end-to-end restoration and
(3) KSP rerouting with line restoration. In the following, the corresponding SVPR problems are
referred to as the SVPR-MF-LINE, SVPR-MF-ETE and SVPR-KSP-LINE problems, respectively.


4.2.  SVPR-MF-LINE 
4.2.1.  Problem  formulation 

The  intricate  math  is  shown  in  Appendix  H  in  order  to  preserve  the  flow  of  the  document. 


1.  Also  refer  to  Appendix  T. 
4.2.2. Solution approach

The  intricate  math  is  shown  in  Appendix  I  in  order  to  preserve  the  flow  of  the  document. 

4.2.3.  Validity  of  the  algorithm 

The  intricate  math  is  shown  in  Appendix  J  in  order  to  preserve  the  flow  of  the  document. 

4.2.4.  Implementation  issue 

As discussed in Section 2.2.4, the speed of the VP planner module determines the feasible range of
the update interval of the VP configuration. The VP planner should output its result as quickly as
possible, say within a period one or two orders of magnitude shorter than the VP update interval. In
order to clarify the applicability of the proposed algorithm, this section examines the required CPU
time for four sample networks. The experiments are conducted on a DEC AlphaStation 200 4/233.
The uniform traffic (UF) demand and its jointly optimal capacity placement are utilized in the
experiments. The traffic demand projected at the design phase of the capacity assignment is called
the base traffic demand in this project. A relative network load (RNL) is used as a metric of the
network load. It is defined to be 100% if the offered traffic load is the same as the base traffic demand.

The CPU time is measured at 100% RNL as well as 102.5% RNL. Physical resource allocation
is designed based on demands of the busy hour traffic projected over some future time frame.
Thus, the demand with 100% or less RNL represents the case when the network is in a normal
operating condition. On the other hand, 102.5% RNL represents the case when there is an
occasional surge of traffic demand beyond the projected level.

For the latter case, the base traffic demand is uniformly increased by 2.5 percent. At 100% RNL,
on the other hand, 30 randomly generated traffic demands are utilized². The end-to-end traffic
requirements of each random input deviate at most 10 percent from those of the base traffic
demand. They are uniformly distributed over this range, while the total amount of traffic demand
is fixed. A sample mean over 30 observations is calculated and reported in Table 4-1. The results at
102.5% RNL are summarized in Table 4-2. The results (the first row of the tables) indicate that the
solution procedure typically completes within a few seconds to a few minutes. This implies that
the proposed algorithm would be applicable to a network with rapidly changing demand patterns
in most cases.


2. Note that if the demand is identical to the base traffic demand the solution is instantly obtainable since it has been
already calculated at the design phase. Thus, we exclude this case from the experiment.
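The two kinds of demand input used in these experiments can be sketched as follows. The text does not say how the total demand is kept fixed while each entry stays within 10 percent of its base value, so the procedure below is an assumption: draw uniform factors, rescale to restore the total, and reject any sample that the rescaling pushes outside the band. After rescaling the entries are only approximately uniform. The demand dictionary, the node numbering and both function names are hypothetical.

```python
import random


def random_demand(base, max_dev=0.10, rng=random, max_tries=10_000):
    """Perturb each end-to-end demand by at most max_dev, keeping the total fixed.

    Assumed procedure (not stated in the text): draw uniform factors,
    rescale to restore the total, and reject samples that leave the band.
    """
    total = sum(base.values())
    for _ in range(max_tries):
        draw = {k: v * rng.uniform(1 - max_dev, 1 + max_dev) for k, v in base.items()}
        scale = total / sum(draw.values())
        sample = {k: v * scale for k, v in draw.items()}
        if all(abs(sample[k] - base[k]) <= max_dev * base[k] + 1e-12 for k in base):
            return sample
    raise RuntimeError("no admissible sample found")


def scale_demand(base, rnl_percent):
    """Uniformly scale the base demand to a given relative network load (RNL)."""
    return {k: v * rnl_percent / 100.0 for k, v in base.items()}


# Hypothetical uniform (UF) base demand between four nodes.
base = {(s, t): 10.0 for s in range(4) for t in range(4) if s != t}

# 30 random inputs at 100% RNL, and one uniformly increased input at 102.5% RNL.
samples = [random_demand(base, rng=random.Random(seed)) for seed in range(30)]
surge = scale_demand(base, 102.5)
```

Seeding each sample separately keeps the 30 inputs reproducible, which is convenient when CPU times are averaged over them.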

Further speed-up can be attained using the following observations. First of all, observe that only
a few commodity path and restoration path variables are generated at the beginning of the solution
procedure. The master process generates at most one candidate commodity path variable per
commodity and one restoration path variable per arc. If several commodity and restoration path
variables are required for each commodity and each arc, which is the case at a higher load, then the
solution procedure must iterate between the master process and the sub-process many times before
reaching a final solution. If most of the candidate paths are known from the beginning, then
unnecessary iterations between the two processes can be avoided. Furthermore, since we have more
plausible candidate paths at hand, the objective function value could advance further at each
iteration of the sub-process. Thus, we expect to obtain a final solution more quickly.

Table 4-1. CPU time for SVPR-MF-LINE 1. 30 random demands at 100% RNL

Network model    NJ-LATA                   (12,50)                   (24,60)                   US-WAN
Test item        Speedᵃ       Lossᵇ        Speed        Loss         Speed        Loss         Speed        Loss
MF-LINE          [illegible]  [illegible]  [illegible]  [illegible]  13.5 s       [illegible]  [illegible]  0.1248
MF-LINE w/RPᶜ    [illegible]  [illegible]  [illegible]  [illegible]  9.71 s       [illegible]  [illegible]  0.1248

a. CPU time measured in seconds (s) or minutes (m).
b. An expected lost flow averaged over 30 samples, in terms of a normalized lost flow (NL).
c. The SVPR-MF-LINE with an RP option (described below). K1=10 and K2=30.


Table 4-2. CPU time for SVPR-MF-LINE 2. UF demand at 102.5% RNL

Network model    NJ-LATA                   (12,50)                   (24,60)                   US-WAN
Test item        Speed        Loss         Speed        Loss         Speed        Loss         Speed        Loss
MF-LINE          [illegible]  3.075        2m 23s       3.740        [illegible]  [illegible]  14m 4s       2.643
MF-LINE w/RPᵃ    [illegible]  3.075        2m 7s        3.740        [illegible]  [illegible]  5m 11s       2.643

a. K1=10 and K2=30.


Candidate paths can be collected by accumulating paths which have been employed over time.
The k shortest paths could also be included in the candidate list. The more candidates we have, the
fewer iterations are required. However, this could prolong the pricing-out procedure, and thus an
excessive generation of the candidates should be avoided. The following approach is applied in
our numerical experiments.

• Collect information on the commodity and restoration path variables used in optimal
solutions at 100%, 102.5% and 105% RNL.

• Generate all commodity variables employed in the above solutions from the beginning³.

• For each restoration path, calculate the sum of carried flows in the above solutions. Make an
ordered list of the restoration path variables for each arc. List the paths with a higher
restoration flow first.

• Append the k shortest restoration paths at the end of the ordered list if they are not included
yet. List the paths with a shorter length first.

• Generate K1 restoration path variables per arc from the top of the list at the beginning of the
procedure.

• Generate more than K1 but at most K2 restoration path variables per arc if such path
variables carry any flow at an optimum.

• Restrict the restoration paths only to the generated ones.

We refer to the SVPR-MF-LINE problem with a restriction on restoration paths as the
SVPR-MF-LINE problem with an RP option. Due to the restriction, only the dual feasibility conditions
related to commodity path variables (Equation (H-4)) must be checked at the master process.
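The construction of the restricted restoration path list for a single arc can be sketched as below. This is one reading of the steps listed above, not the implementation used in the experiments: paths are assumed to be encoded as tuples of node identifiers, the summed flows and the k shortest paths are assumed to be given, and the function name is hypothetical. The K1/K2 rule is interpreted as "take K1 paths unconditionally, and continue up to K2 only for paths that carried flow in a reference solution".

```python
def candidate_restoration_paths(flow_by_path, k_shortest, k1=10, k2=30):
    """Build the restricted candidate list of restoration paths for one arc.

    flow_by_path: {path: restoration flow summed over the reference solutions}
    k_shortest:   restoration paths sorted by increasing length
    """
    # Paths that carried flow, largest summed flow first.
    used = sorted((p for p, f in flow_by_path.items() if f > 0),
                  key=lambda p: -flow_by_path[p])
    # Append the k shortest paths that are not listed yet, shortest first.
    extra = [p for p in k_shortest if p not in used]
    ordered = used + extra
    # K1 paths are always generated; beyond K1 (up to K2) only paths that
    # carried flow in a reference solution are added.
    count = min(k2, max(k1, len(used)))
    return ordered[:count]


# Hypothetical data for one arc whose end nodes are "a" and "b".
flows = {("a", "c", "b"): 4.0, ("a", "d", "b"): 7.5, ("a", "c", "d", "b"): 0.5}
ksp = [("a", "c", "b"), ("a", "d", "b"), ("a", "e", "b"), ("a", "c", "d", "b")]
initial = candidate_restoration_paths(flows, ksp, k1=2, k2=3)
```

With K1=2 and K2=3 the list holds the three paths that carried flow, ordered by flow; the unused path through "e" is added only when K1 is large enough to reach it.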

The results are shown in the last row of Table 4-1 and Table 4-2. K1 and K2 are set to 10 and 30,
respectively. Restriction on the restoration paths could degrade the attainable survivability
measure. Thus, the optimal expected lost flow achieved by each scheme is also listed in the tables for
comparison. The amount of lost flow is normalized with respect to the average link
capacity. NL (normalized loss) is defined as the unit of normalized lost flow, where 100 NL is equal
to the average link capacity. The results indicate that we could double or even triple the speed of
computation with an RP option. No degradation of attainable survivability is observed in this
experiment, although a small degradation appears if K1 and K2 are reduced to 5 and 15,
respectively.


3. Based on our experiments, few commodity paths are used at an optimum. Thus, the number of generated path
variables is not excessive.

4.3. SVPR-MF-ETE
4.3.1.  Problem  formulation 


The  intricate  math  is  shown  in  Appendix  K  in  order  to  preserve  the  flow  of  the  document. 

4.3.2.  Solution  approach 

The  intricate  math  is  shown  in  Appendix  L  in  order  to  preserve  the  flow  of  the  document. 

4.3.3.  Validity  of  the  algorithm 

The  intricate  math  is  shown  in  Appendix  M  in  order  to  preserve  the  flow  of  the  document. 


4.3.4.  Implementation  issues 

Due  to  the  increased  complexity  and  size  of  the  problem,  the  SVPR-MF-ETE  problem  is 
expected  to  require  longer  computation  time  than  the  SVPR-MF-LINE  problem.  As  before,  the 
required  CPU  time  is  measured  for  the  four  sample  networks  to  see  the  applicability  of  the  solution 
procedure.  The  experiments  are  conducted  with  the  same  conditions  as  in  Section  4.2.4.  The 
results  are  summarized  in  Tables  4-3  and  4-4. 

The  results  (the  first  row  of  the  tables)  indicate  that  except  for  a  small  network  it  takes  20  to  35 
minutes  to  complete  the  process  under  normal  operating  conditions.  Thus,  end-to-end  restoration 
might  not  be  applicable  to  a  network  where  a  VP  reconfiguration  procedure  must  be  invoked  very 
frequently.  When  the  network  load  goes  up,  the  required  CPU  time  grows  considerably,  suggesting 
that  its  applicability  is  largely  restricted  at  this  range  of  the  network  load. 

As in the case of the SVPR-MF-LINE problem, generating and restricting candidate paths could
reduce the computation time. Unlike the previous case, the restoration flow must be obtained for
all combinations of commodities and failure events. This requires a significantly larger number of
restoration path flow variables. Generating many candidate paths, however, could considerably
prolong the pricing-out operation. To cope with the problem, the following observations are
utilized. First of all, since only a fraction of the (π, l) constraints becomes active at a time, the
candidate restoration paths can be generated only when the related constraint is active. Secondly, it is
possible to decrease the number of candidate restoration paths corresponding to each (π, l) constraint.
In the case of line restoration, arc restoration paths are shared among multiple commodities. On the
other hand, (π, l) restoration paths are only prepared for commodity π, and thus not so many paths
are needed. Finally, a restriction is also placed on the commodity paths in order to further reduce
the computation time.

In the experiments, the candidate paths are collected in the following manner:

• Collect information on the path variables used in the optimal solutions at 100%, 102.5% and
105% RNL. Also, gather information on the path variables which have been generated
during the course of the iterations, but stay out of the basis at an optimum.

• For each commodity and restoration path, calculate the sum of carried flow in the above
solutions.

• Make an ordered list of the commodity path variables for each commodity. List the paths
with a higher traffic flow first. Similarly, make an ordered list of the restoration path
variables for each combination of π and l.

• Generate the top P1 commodity path variables per commodity from the list. Generate more than
P1 but at most P2 variables if such paths have carried some traffic in the optimal solutions.

• When a (π, l) constraint becomes active, generate the top RP1 (π, l) restoration path variables
from the ordered list. Generate more than RP1 but at most RP2 variables if such paths have
carried some restoration traffic in the optimal solutions.


Table 4-3. CPU time for SVPR-MF-ETE 1. 30 random demands at 100% RNL

Network model    NJ-LATA                   (12,50)                   (24,60)                   US-WAN
Test item        Speedᵃ       Lossᵇ        Speed        Loss         Speed        Loss         Speed        Loss
MF-ETE           9.65 s       1.206        20m 28s      1.811        21m 46s      0.5001       34m 27s      0.3076
MF-ETE w/PRPᶜ    8.85 s       1.295        16m 53s      1.939        19m 33s      0.5053       32m 18s      0.3112
MF-LINE          1.04 s       0.6109       1m 13s       0.2454       13.5 s       0.1324       8m 31s       0.1248

a. CPU time measured in seconds (s) or minutes (m).
b. An expected lost flow averaged over 30 samples, in terms of a normalized lost flow (NL).
c. SVPR-MF-ETE scheme with commodity path and restoration path restriction. P1=1, P2=2, RP1=2
   and RP2=6. This choice requires the least CPU time based on our experiments.


Table 4-4. CPU time for SVPR-MF-ETE 2. UF demand at 102.5% RNL

Network model    NJ-LATA                   (12,50)                   (24,60)                   US-WAN
Test item        Speedᵃ       Loss         Speed        Loss         Speed        Loss         Speed        Loss
MF-ETE           41.3 s       4.812        53m 15s      7.898        2h 36m       5.973        14h 29m      4.002
MF-ETE w/PRPᵇ    16.5 s       4.812        22m 3s       [illegible]  27m 2s       5.973        1h 41m       4.002
MF-LINE          1.22 s       3.075        2m 23s       3.740        10.6 s       1.560        [illegible]  [illegible]
MF-LINE w/RPᶜ    1.17 s       3.075        2m 7s        3.740        6.45 s       1.560        [illegible]  [illegible]

a. CPU time measured in seconds (s), minutes (m) or hours (h).
b. P1=1, P2=2, RP1=2 and RP2=6.
c. K1=10 and K2=30.

The results are reported in the second rows of Tables 4-3 and 4-4. The above mechanism
improves the speed of the solution procedure at 102.5% RNL. However, the required time may not
be small enough for a network with rapidly changing demand patterns. Although such a high load
would seldom be encountered in a normal situation, it could happen after a network failure. On
such an occasion, it would be desirable to perform network-wide restoration at an earlier time. One
possible approach is to perform reconfiguration gradually. Since a primal simplex method is
employed in the solution procedure, any intermediate solution is feasible. Therefore, a currently
available solution could be gradually applied over the course of calculation of the optimal
solution.

At 100% RNL, the survivability measure experiences small degradation. Since only a small
number of commodity and restoration paths are generated for each commodity and each arc, they
could deviate from the paths actually required for the randomly generated demands. Furthermore,
no drastic CPU time improvement has been observed at 100% RNL. One possible reason is that
the number of constraints is too large to reap the benefit of candidate path generation. It has been
empirically observed that the number of simplex iterations grows considerably with the number of
constraints and only very slowly with the number of variables [17]. Thus, the benefit of the
variable restriction could be cancelled by the explosive number of constraints. Based on the
experiment, an almost identical number of pivoting operations is required. As a result, tens of minutes of
computational time are still required for most networks. This implies that the applicability of
end-to-end restoration is somewhat restricted compared to line restoration.

4.4.  SVPR-KSP-LINE 

The SVPR-KSP-LINE problem addresses the virtual path routing problem assuming
KSP-based line restoration as the self-healing protocol. In general, the protocol works as follows. When a
link failure happens, the protocol first searches for restoration paths through message flooding.
Every time such a path becomes available, a part of the affected virtual paths is restored using the
maximum bandwidth found over the path; that is, the self-healing algorithm first recovers a part of
the affected virtual paths by using the entire spare bandwidth available on the shortest restoration
route. A subset of the remaining affected virtual paths is restored over the second-shortest route,
and further remainders attempt to use the third-shortest route. This procedure is repeated until
enough bandwidth is found to restore all affected VPs or until all possible restoration routes are
exhausted. Due to the substantial difference in the restoration process, the problem formulation
differs significantly in terms of the restoration flow constraints. The problem turns out to be
non-convex, which makes it difficult to find a globally optimal solution. Two heuristic procedures are
developed based on two different problem formulations.
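The route-by-route behaviour of the protocol described above can be illustrated with a small centralized sketch. It is a simplification under stated assumptions, not the distributed message-flooding protocol itself: links are treated as undirected with a single spare bandwidth value, route length is measured in hops, the next route is found by a breadth-first search over links that still have spare bandwidth, and the function name and sample topology are hypothetical.

```python
from collections import deque


def ksp_line_restore(spare, failed, demand):
    """Greedily restore `demand` units of bandwidth around a failed link.

    spare:  {(u, v): spare bandwidth} with one entry per undirected link
    failed: the (u, v) link that failed
    Returns (list of (route, restored bandwidth), unrestored bandwidth).
    """
    residual = {}
    for (a, b), cap in spare.items():
        if {a, b} != set(failed):
            residual[(a, b)] = residual[(b, a)] = cap  # shared by both directions
    src, dst = failed
    routes, remaining = [], demand
    while remaining > 0:
        # Breadth-first search: shortest route (in hops) with spare bandwidth left.
        parent, queue = {src: None}, deque([src])
        while queue and dst not in parent:
            node = queue.popleft()
            for (a, b), cap in residual.items():
                if a == node and cap > 0 and b not in parent:
                    parent[b] = node
                    queue.append(b)
        if dst not in parent:
            break  # all restoration routes are exhausted
        route, node = [dst], dst
        while parent[node] is not None:
            node = parent[node]
            route.append(node)
        route.reverse()
        links = list(zip(route, route[1:]))
        # Use the entire bottleneck bandwidth of the route, or what is still needed.
        amount = min(remaining, min(residual[link] for link in links))
        for a, b in links:
            residual[(a, b)] -= amount
            residual[(b, a)] -= amount
        routes.append((route, amount))
        remaining -= amount
    return routes, remaining


# Hypothetical spare bandwidth around the failed link A-B.
spare = {("A", "B"): 0, ("A", "C"): 5, ("C", "B"): 3,
         ("A", "D"): 4, ("D", "E"): 4, ("E", "B"): 4}
routes, lost = ksp_line_restore(spare, ("A", "B"), demand=8)
```

In this example the two-hop route through C is filled first (3 units), then the three-hop route through D and E (4 units), and 1 unit remains unrestored. Because each route takes as much as it can before the next one is considered, the amount restored depends on how the working VPs and spare capacity are laid out, which is what the SVPR-KSP-LINE problem tries to optimize.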

4.4.1.  Problem  formulation 

The  intricate  math  is  shown  in  Appendix  N  in  order  to  preserve  the  flow  of  the  document. 

4.4.2.  Solution  approach  1.  Modified  flow  deviation  (MFD)  method 

The  intricate  math  is  shown  in  Appendix  O  in  order  to  preserve  the  flow  of  the  document. 

4.4.3.  Solution  approach  2.  Lagrangian  relaxation  method 

The  intricate  math  is  shown  in  Appendix  P  in  order  to  preserve  the  flow  of  the  document. 

4.4.4.  Comparison  of  two  heuristics  for  the  SVPR-KSP-LINE  problem 

This section evaluates the performance of the two heuristics for the SVPR-KSP-LINE problem
proposed in the preceding two sections, the MFD method and the LAG method. A comparative
analysis of their attainable expected lost flow is performed based on four sample networks. The
uniform traffic (UF) demand and its jointly optimal capacity assignment are utilized in the
experiments. The base traffic demand from 100% to 120% RNL is examined. Figure 4-1 depicts
the result of the experiments.

The result shows that the LAG method gives a better solution for all the cases we tested. As
expected, the LAG method can save more flow as the load grows, up to 2.2 NL at 120% RNL.
However, the difference is nominal when the load is light, typically less than 0.5 NL at 102.5%
and 105% RNL.


Figure 4-1. Solution to the SVPR-KSP-LINE problem with two heuristics
Panels: (a) NJ-LATA sample network, (b) (12,50) sample network, (c) (24,60) sample network,
(d) US-WAN sample network.

In addition to the solutions based on the two heuristics, the lower bound based on the
constraint relaxation technique is plotted along the dotted line. The relaxed problem
corresponds to the SVPR-MF-LINE problem with an RP option. The minimum value of this
relaxed problem produces the absolute lower bound of the solution to the SVPR-KSP-LINE
problem.

Figure 4-1 also plots the lower bound of the optimal solution along a dotted line. This lower
bound is calculated with a constraint relaxation technique (see Appendix S.1), where the KSP
constraints (N-10) are completely removed from the problem (SVPR-KSP-LINE2). Then, the
resulting minimization problem becomes the SVPR-MF-LINE problem with an RP option. Unlike the
Lagrangian subproblem (SVPR-KSP-LINE-LAG), a global minimum is obtainable for this
relaxation problem⁴. From the property (S-3) in Appendix S.1, this minimum value is guaranteed to be
a lower bound of the SVPR-KSP-LINE problem. However, the bound could be loose since the
KSP constraints are not imposed at all. Table 4-5 compares the lower bound and the solution
(upper bound) based on the LAG method.

The upper bound is very close to the lower bound for the NJ-LATA and (24,60) networks. The
difference is typically less than 0.5 NL, and the gap is mostly less than 5 percent. This indicates
that not only is the solution near-optimal, but also the KSP and MF schemes do not make a
significant difference in terms of their attainable survivability measure. On the other hand, a wider gap
between the upper and lower bounds is observed for the US-WAN network. The gap becomes
considerably wider for the (12,50) network; it grows as much as 7.8 NL. This could be due to a large
difference between the attainable survivability measures of the KSP and MF schemes. The MF
scheme might be able to restore considerably more flow for a large and/or well-connected
network. The scheme could distribute a restoration flow more wisely since more candidate restoration
paths are available in such networks.
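The difference and tolerance columns of Table 4-5 follow from the two bounds, as defined in the table notes: the difference is the upper bound minus the lower bound, and the tolerance is that difference relative to the lower bound. The sketch below recomputes both for three rows of the table; the function name is illustrative, and the last digit can differ slightly from the table because the tabulated bounds are rounded.

```python
def bound_gap(upper, lower):
    """Return (difference in NL, tolerance in percent) for one pair of bounds."""
    diff = upper - lower
    return diff, 100.0 * diff / lower


# (network, upper bound, lower bound) taken from three rows of Table 4-5.
rows = [("NJ-LATA", 30.120, 29.742),
        ("(12,50)", 16.707, 8.916),
        ("US-WAN", 6.685, 4.805)]

gaps = {name: bound_gap(ub, lb) for name, ub, lb in rows}
for name, (diff, tol) in gaps.items():
    print(f"{name:8s} difference = {diff:6.3f} NL, tolerance = {tol:5.1f} %")
```

A small tolerance certifies that the heuristic solution is near-optimal. A large one, as for the (12,50) row, does not by itself show that the solution is poor, since the relaxed lower bound may simply be loose.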

In  order  to  verify  this  hypothesis,  we  have  performed  an  intensive  random  search  based  on  the 
SVPR-KSP-LINE-LAG-global  procedure  where  the  KSP  constraints  are  also  taken  into  account. 
Figure  4-2  shows  local  minima  obtained  throughout  the  SVPR-KSP-LINE-LAG-global  procedure. 
One  thousand  random  samples  are  checked  at  105%  RNL  for  the  (12,50)  network.  As  discussed  in 
Appendix  S,  a  Lagrangian  multiplier  largely  determines  how  close  the  obtained  lower  bound  is  to 
an  optimum.  The  Lagrangian  relaxation  technique  iteratively  refines  a  Lagrangian  multiplier  to 
find a tighter lower bound. In the experiment, two distinct Lagrangian multipliers are employed,
both of which were encountered at a later stage of the LAG method. Result (b) uses the multiplier
at the final iteration of the LAG method, and thus a tighter lower bound is expected.

4. This is because the SVPR-MF-LINE problem with an RP option is a linear programming problem.
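
The iterative refinement of a Lagrangian multiplier can be illustrated with a generic subgradient
iteration. The following is a minimal sketch on a toy 0-1 covering problem, not the SVPR-KSP-LINE
formulation itself; the function name, the diminishing step-size rule and the data are assumptions
made purely for illustration.

```python
from typing import List, Tuple


def lagrangian_lower_bound(cost: List[float], weight: List[float], demand: float,
                           iterations: int = 50) -> Tuple[float, float]:
    """Toy problem: minimize sum(cost[i] * x[i]) subject to
    sum(weight[i] * x[i]) >= demand, with x[i] in {0, 1}.

    Relaxing the covering constraint with a multiplier lam >= 0 gives
    L(lam) = lam * demand + sum(min(0, cost[i] - lam * weight[i])),
    which is a lower bound on the optimum for every lam >= 0.
    Returns the best bound found and the multiplier that produced it.
    """
    lam = 0.0
    best_bound, best_lam = float("-inf"), 0.0
    for k in range(1, iterations + 1):
        # Relaxed problem: set x[i] = 1 exactly when its reduced cost is negative.
        x = [1 if c - lam * w < 0 else 0 for c, w in zip(cost, weight)]
        bound = lam * demand + sum((c - lam * w) * xi
                                   for c, w, xi in zip(cost, weight, x))
        if bound > best_bound:
            best_bound, best_lam = bound, lam
        # The constraint violation is a subgradient of L at lam.
        violation = demand - sum(w * xi for w, xi in zip(weight, x))
        # Assumed diminishing step size 1/k; project back onto lam >= 0.
        lam = max(0.0, lam + violation / k)
    return best_bound, best_lam


# Hypothetical data: the integer optimum of this instance is 8.
bound, multiplier = lagrangian_lower_bound([4.0, 3.0, 5.0], [2.0, 1.0, 3.0], 4.0)
print(f"lower bound = {bound:.3f} at multiplier = {multiplier:.3f}")
```

Multipliers visited late in the iteration give bounds close to the best attainable one, which is
in line with the expectation that the multiplier from the final stage yields the tighter estimate.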

The  results  suggest  the  validity  of  the  above  hypothesis:  The  lower  bound  estimate  resides  just 
below  the  upper  bound.  The  gap  is  only  0.24  NL  or  5.7  percent.  On  the  other  hand,  the  estimate 
lies  3.24  NL  above  the  absolute  lower  bound,  yielding  a  more  than  13  times  wider  gap.  The  same 
experiment  is  also  conducted  at  112.5%  RNL,  and  the  result  further  confirms  the  hypothesis  (Fig¬ 
ure  4-3). 
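
The difference and tolerance used to compare the bounds follow directly from the definitions given
with Table 4-5. The sketch below recomputes them for one (12,50) row of that table; the function
name is an assumption introduced here.

```python
from typing import Tuple


def bound_gap(upper_bound: float, lower_bound: float) -> Tuple[float, float]:
    """Return (difference in NL, tolerance in percent) for a pair of bounds.

    difference = UBD - LBD, tolerance = (UBD - LBD) / LBD.
    """
    if lower_bound <= 0:
        raise ValueError("tolerance is undefined for a non-positive lower bound")
    difference = upper_bound - lower_bound
    tolerance = 100.0 * difference / lower_bound
    return difference, tolerance


# (12,50) row of Table 4-5 with UBD = 16.707 NL and LBD = 8.916 NL.
diff, tol = bound_gap(16.707, 8.916)
print(f"difference = {diff:.3f} NL, tolerance = {tol:.1f}%")
```

The computed tolerance is about 87.4 percent, as in the table; the difference deviates from the
tabulated 7.790 NL only in the last digit, presumably because the table was computed from
unrounded bounds.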

Finally, the computation time for the LAG method is significantly longer than that of the MFD
method. The LAG method requires as much as tens of minutes to an hour, making it impractical to
apply the method to a network with rapidly changing demand patterns. The MFD method, on the
other hand, is very quick. Although the MFD method ends up with a higher lost flow, it can usually
produce a comparable (near-optimal) solution as long as the load stays around the projected level.


Table 4-5. Upper and lower bounds for the SVPR-KSP-LINE problem

Network    Load     UBD(a)    LBD(b)    Difference(c)   Tolerance(d)
Model      (RNL)    (NL)      (NL)      (NL)            (%)

NJ-LATA    [?]      1.890     1.812     [?]             [?]
           [?]      4.986     4.434     [?]             [?]
           [?]      12.256    11.783    [?]             4.01
           [?]      30.120    29.742    0.378           1.27
(12,50)    [?]      [?]       0.030     0.334           1118
           [?]      [?]       1.046     3.482           333
           [?]      16.707    8.916     7.790           87.4
           [?]      48.182    43.562    4.620           10.6
(24,60)    [?]      0.2555    0.2554    0.0001          [?]
           [?]      0.796     0.796     0.0000          [?]
           [?]      3.495     3.430     0.065           [?]
           [?]      13.086    12.313    [?]             [?]
US-WAN     [?]      0.537     0.537     0.0000          0.00
           [?]      2.085     1.384     0.701           50.6
           [?]      6.685     4.805     1.880           39.1
           [?]      16.748    15.395    1.353           8.79

a. Upper bound. Solution obtained by the LAG method.
b. Lower bound. Optimal solution based on the KSP constraint relaxation.
c. (Upper bound) - (Lower bound)
d. [(Upper bound) - (Lower bound)] / (Lower bound)
[?]: entry not legible.


Figure 4-2. Lower bound estimate for the SVPR-KSP-LINE problem (105% RNL)

The result is for the (12,50) sample network at 105% RNL. The two results correspond to the
two distinct sets of Lagrangian multipliers. Result (b) uses the one encountered at the final stage of
the LAG method, yielding a tighter lower bound estimate. The gap between the upper bound and the
lower bound estimate is only 0.24 NL or 5.7 percent, while the difference between the estimate and
the absolute lower bound is 3.24 NL.


Figure 4-3. Lower bound estimate for the SVPR-KSP-LINE problem (112.5% RNL)

[Plot axes: Samples; Expected Lost Flow (NL)]

The result is for the (12,50) sample network at 112.5% RNL. The two results correspond to the
two distinct sets of Lagrangian multipliers. Result (b) uses the one encountered at the final stage of the
LAG method, yielding a tighter lower bound estimate. The gap between the upper bound and the lower
bound estimate is only 0.48 NL or 2.0 percent, while the difference between the estimate and the
absolute lower bound is 7.0 NL.

4.5.  Evaluations 

This  section  investigates  the  performance  of  the  proposed  survivable  VP  reconfiguration 
schemes.  First  of  all,  the  attainable  survivability  measure  is  examined  to  see  how  well  the  pro¬ 
posed  optimization  procedures  work  in  response  to  varying  traffic  demands  and  at  various  offered 
network  loads.  The  effectiveness  of  dynamic  VP  reconfiguration  is  then  demonstrated  through 
comparative  analysis  with  a  static  routing  scheme  as  well  as  an  existing  dynamic  routing  scheme. 
The  three  fast  restoration  schemes  are  compared  in  terms  of  restorability  under  the  same  network 
environment.  Finally,  the  efficiency  of  the  proposed  two-step  restoration  approach  is  explored. 

The eight sample networks which were examined in the evaluation of the SCFA schemes
(Figures 3-4 and 3-5) are employed in the experiments of this section. As discussed in Section 1,
the physical link capacity assignment is one of the key factors in determining the performance of
survivable virtual path assignment as well as of the fast restoration protocol. Thus, evaluating the
SVPR schemes on arbitrarily assigned network resources would not produce meaningful results,
although such an approach has often been taken in previous works on self-healing networks. In our
experiment, the link capacity assignment is obtained through the SCFA-JOA procedure for a given
projected traffic demand (base traffic demand). With this design, full service restorability is
ensured against any single link failure when the base traffic demand is offered to the network. The
performance  of  the  proposed  SVPR  procedures  is  examined  at  various  network  loads  around  the 
base  traffic  demand,  which  would  arise  in  a  practical  situation.  As  for  the  base  traffic  demand,  the 
uniform  traffic  demand  (UF  demand)  is  utilized  in  this  section  unless  specified  otherwise.  The  fol¬ 
lowing  optimization  parameter  values  are  used  for  the  MFD  procedure:  6=0.01%  and  £=10"^%. 
The  step  size,  h,  is  adjusted  over  the  iterations  between  10'^%  and  0.5%  of  the  average  link  capac¬ 
ity. 

4.5.1.  Attainable  survivability 

This subsection investigates the attainable survivability measure (an expected lost flow) to see
how well the proposed SVPR procedures work under various offered loads. In the experiment, the
base traffic demands are uniformly scaled from 90 to 110 percent of the original base traffic
requirement to examine the performance when the load is light as well as when the traffic demand
grows. In the following discussion, this uniformly scaled demand is called a base demand of
the load. A relative network load (RNL)⁵ is used as a metric of the network load. A demand of
100% or less RNL represents the case when the network is in a normal operating condition, as
discussed before. On the other hand, a load slightly more than 100% RNL represents the case
when the traffic demand grows beyond the projected level over months or years, or the case when
an unexpected surge of traffic demand arises somewhere in the network. A still heavier load
represents the case when the network suffers from abnormal conditions such as multiple network
failures or an unexpected overshooting of demand across the network.

In order to capture the effect of a varying traffic demand, 30 demand patterns are examined at
each load: the base demand of the load and 29 additional random traffic demands⁶. The results
reported in this subsection are sample means over 30 observations. Note that even if the load is
100% or less RNL, the optimal expected lost flow is not necessarily zero unless the demand
pattern is identical to that of the base traffic demand.
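
The construction of the 30 demand patterns per load can be sketched as follows. This is only one
possible way to generate such inputs, assuming strictly positive base demands: each end-to-end
requirement is perturbed by a uniform factor, the pattern is rescaled so that the total demand is
unchanged, and patterns that violate the deviation limit after rescaling are rejected. The
function names, the rejection step and the seed are assumptions, and the rescaling means the
accepted factors are only approximately uniform.

```python
import random
from typing import Dict, List, Tuple

Demand = Dict[Tuple[int, int], float]  # (source, destination) -> traffic requirement


def scale_demand(base: Demand, rnl_percent: float) -> Demand:
    """Base demand of the load: every requirement scaled to the given RNL."""
    return {pair: value * rnl_percent / 100.0 for pair, value in base.items()}


def random_demand(base: Demand, max_dev: float, rng: random.Random,
                  max_tries: int = 10000) -> Demand:
    """Random pattern deviating at most max_dev per pair, with the total fixed."""
    total = sum(base.values())
    for _ in range(max_tries):
        factors = {pair: rng.uniform(1.0 - max_dev, 1.0 + max_dev) for pair in base}
        # Rescale so that the total amount of traffic demand stays fixed.
        scale = total / sum(base[pair] * factors[pair] for pair in base)
        # Accept only if every pair is still within the deviation limit.
        if all(abs(factors[pair] * scale - 1.0) <= max_dev for pair in base):
            return {pair: base[pair] * factors[pair] * scale for pair in base}
    raise RuntimeError("no admissible demand pattern found")


def demand_patterns(base: Demand, rnl_percent: float, max_dev: float = 0.10,
                    count: int = 30, seed: int = 0) -> List[Demand]:
    """The base demand of the load followed by count - 1 random patterns."""
    rng = random.Random(seed)
    load_base = scale_demand(base, rnl_percent)
    return [load_base] + [random_demand(load_base, max_dev, rng)
                          for _ in range(count - 1)]


# Hypothetical uniform base demand on a 4-node network, evaluated at 105% RNL.
uniform_base = {(i, j): 1.0 for i in range(4) for j in range(4) if i != j}
patterns = demand_patterns(uniform_base, 105.0)
print(len(patterns), round(sum(patterns[1].values()), 6))
```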

Figure  4-4  shows  the  attainable  survivability  measure  for  each  restoration  scheme.  The  MFD 
(modified  flow  deviation)  method  is  utilized  for  the  SVPR-KSP-LINE  problem.  The  result  indi¬ 
cates  that  a  lost  flow  increases  very  quickly  as  the  offered  network  load  grows  beyond  the  pro¬ 
jected  demand  level.  On  the  other  hand,  a  loss  is  nominal  and  typically  far  below  1  NL  with  100% 
or  less  RNL,  which  will  be  the  situation  encountered  most  often  in  practice.  This  small  loss  is  due 
to  demand  fluctuation.  Table  4-6  clarifies  the  relation  between  the  expected  lost  flow  and  the  max¬ 
imum  demand  fluctuation.  Survivability  suffers  as  the  demand  fluctuates  more  from  the  base  traf¬ 
fic  load,  but  the  degradation  is  rather  insignificant  compared  with  that  caused  by  a  demand 
increase. 

The level of survivability degradation beyond 100% RNL differs according to the restoration
scheme. The quickest increase in the expected lost flow is observed for the MF-ETE restoration,
while the KSP-LINE restoration results in the smallest rate of increase. Since an MF-ETE-based
network can attain the highest degree of sharing in spare capacity (DSSC) at the link capacity design
phase, it assigns the least redundant capacity over the network. As a result, the degradation
becomes more prominent once the load exceeds the projected level. On the other hand, a
KSP-LINE-based network has the most redundant capacity, and thus it is less sensitive to a demand
increase. A similar argument holds for the degradation due to demand fluctuation. Table 4-6 shows
that an MF-ETE-based network incurs more data loss at the time of a network failure.

5. Recall that the RNL is defined to be 100% if the offered load is the same as that projected at the
design phase of the network.

6. As before, the end-to-end traffic requirements of each random input deviate at most 10 percent from
those of the base demand of the load. They are uniformly distributed over this range while the total
amount of traffic demand is fixed.


Figure 4-4. Attainable survivability measure for various offered network loads

[Panels: (a) MF-LINE scheme; (b) MF-ETE scheme; (c) KSP-LINE scheme]

Network  models  also  determine  an  expected  lost  flow  over  100%  RNL.  For  example,  the  (12,50) 
network  shows  the  steepest  growth.  The  (24,60)  network  has  a  relatively  lower  increase  rate  if  the 
MF-LINE  restoration  scheme  is  applied,  while  the  increase  is  rather  quick  with  the  MF-ETE  resto¬ 
ration  scheme.  These  phenomena  can  also  be  explained  by  the  level  of  redundant  capacity 
assigned  at  the  design  phase:  The  networks  with  relatively  smaller  spare  capacity  are  more  vulner¬ 
able  to  network  failure  at  a  higher  load.  Such  a  relationship  between  the  amount  of  spare  capacity 
and  a  lost  flow  is  generally  observed  at  105%  RNL,  as  illustrated  in  Figure  4-5.  The  (24,60)  net¬ 
work  requires  considerably  less  spare  capacity  if  the  MF-ETE  restoration  is  adopted.  Therefore, 
the  survivability  degradation  rates  over  100%  RNL  substantially  differ  with  the  two  restoration 
schemes  for  the  (24,60)  network. 

When the load grows further, a huge loss of data becomes unavoidable upon link failure. At
110% RNL, the expected lost flow becomes as much as 40 NL. As discussed above, though, this is
not in the normal operating region. The load may increase up to this range, but this must happen
very infrequently, possibly due to an occasional surge of traffic or multiple network failures.
However, a higher level of network survivability cannot be assured if the load stays in this region
due to growing traffic demand. Therefore, the FN manager must be triggered to install additional
network resources before the traffic increases to this level. As mentioned above, a lost flow
develops very quickly for the networks with a higher DSSC. This suggests that the design process
for an additional capacity installation should be planned at an earlier stage of the demand growth
in such a network in order to maintain a desired survivability measure.


Table 4-6. An expected lost flow versus maximum demand fluctuation

                                   Maximum Demand Fluctuation
Scheme      Network model          5%      10%     15%     20%

MF-ETE      NJ-LATA                0.60    1.19    1.56    2.04
            (12,50)                0.83    1.81    2.67    3.60
            (24,60)                0.26    0.50    0.76    1.02
            US-WAN                 0.17    0.31    0.49    0.62
MF-LINE     NJ-LATA                0.24    0.63    0.96    1.51
            (12,50)                0.06    0.25    0.53    0.93
            (24,60)                0.05    0.14    0.20    0.28
            US-WAN                 0.05    0.13    0.20    0.28
KSP-LINE    NJ-LATA                0.29    0.70    0.92    1.48
            (12,50)                0.02    0.06    0.14    0.26
            (24,60)                0.00    0.00    0.01    0.02
            US-WAN                 0.03    0.06    0.10    0.14

Thirty random demands are used in the experiment. A sample mean of 30 observations is reported.
An average expected lost flow is expressed in NL (normalized loss).


Figure 4-5. A lost flow at 105% RNL versus normalized spare capacity cost

[Panel (c): KSP-LINE scheme; axis: Required Spare Capacity Cost (NC)]

The UF demand pattern is employed as a base traffic load. The normalized lost flow is a sample mean
over 30 samples at 105% RNL: one base demand plus 29 randomly fluctuated demands. End-to-end
traffic requirements of each random demand deviate at most 10 percent around the base demand and are
uniformly distributed over this range.

4.5.2.  Effectiveness  of  dynamic  VP  reconfiguration 

4.5.2.1. Comparison with a static routing scheme

We propose a dynamic virtual path routing scheme in order to attain better survivability in
response to demand dynamics. Previous works on self-healing networks assume a predetermined
flow assignment. This section demonstrates the effectiveness of the proposed dynamic VP
reconfiguration scheme through comparative analysis with a static routing scheme.

In contrast to a dynamic routing scheme, a static routing scheme utilizes predetermined routes
and a predetermined bandwidth allocation strategy. Control simplicity is a major advantage of a
static routing scheme, but it comes at the expense of survivability and network utilization. The
static routing scheme employed in this experiment exploits the information on the flow assignment
obtained in the network design. Recall that the joint optimization procedure of the SCFA outputs
not only a capacity allocation but also an optimal virtual path routing and bandwidth assignment
for a projected traffic demand. In the static routing scheme, the traffic demand of each commodity
is sent over the same routes as used in the optimal VP assignment. If multiple routes are employed
for a commodity, its demand is distributed over the routes at the same ratio as obtained in the
design phase. Full restorability is assured in this assignment if a projected traffic demand is
offered to the network. In the following discussion, this static routing scheme is referred to as
the SSR (survivable static routing) scheme, while our proposed routing method is called the SDR
(survivable dynamic routing) scheme⁷.
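
The proportional split used by the static scheme can be sketched as follows. The function name
and the route labels are hypothetical; only the rule itself (same routes, same ratio as in the
design phase) is taken from the description above.

```python
from typing import Dict


def static_route_flows(design_flows: Dict[str, float], demand: float) -> Dict[str, float]:
    """Distribute a commodity's demand over its design-phase routes.

    design_flows maps each route of the commodity to the flow assigned to it
    in the design phase; the new demand is split at the same ratio.
    """
    design_total = sum(design_flows.values())
    if design_total <= 0:
        raise ValueError("the commodity carries no flow in the design-phase assignment")
    return {route: demand * flow / design_total
            for route, flow in design_flows.items()}


# Hypothetical commodity routed over two paths at a 3:1 ratio in the design phase.
flows = static_route_flows({"route_1": 6.0, "route_2": 2.0}, demand=10.0)
print(flows)
```

Because the routes and ratios never change, a demand pattern that differs from the projected one
can overload some links while leaving spare capacity idle elsewhere, which is the weakness the
comparison below quantifies.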

Thirty random demands deviating around the projected traffic demand (100% RNL) are examined
in the experiment. Each random input has at most 10 or 20% demand deviation from the base
traffic demand. The average expected lost flow due to the SSR and SDR schemes is reported in
Table 4-7. The table also lists the average survivability gain. It is defined by
$(\hat{L}_{SSR} - \hat{L}_{SDR}) / \hat{L}_{SSR}$, where $\hat{L}_{SSR}$ and $\hat{L}_{SDR}$ are the
average expected lost flow attained by the SSR and SDR schemes, respectively⁸. The MFD method is
used as the dynamic routing strategy for KSP-LINE-based networks. In some cases, the traffic
demand cannot even be accommodated with the static routing scheme due to capacity constraints.
Such a flow is counted as lost in the above calculation.
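
The survivability gain is a ratio of sample means, which the following sketch makes explicit. The
function name is an assumption; the example values are the KSP-LINE entries of Table 4-7 for the
NJ-LATA network with 20% maximum deviation.

```python
from statistics import mean
from typing import Sequence


def survivability_gain(reference_losses: Sequence[float],
                       sdr_losses: Sequence[float]) -> float:
    """Gain in percent: (mean reference loss - mean SDR loss) / mean reference loss.

    Each sequence holds the expected lost flow (NL) of the individual
    observations; the sample means are taken before the ratio is formed.
    """
    reference_mean = mean(reference_losses)
    sdr_mean = mean(sdr_losses)
    if reference_mean <= 0:
        raise ValueError("gain is undefined when the reference scheme loses no flow")
    return 100.0 * (reference_mean - sdr_mean) / reference_mean


# Tabulated sample means: SSR 3.24 NL, SDR 1.48 NL.
print(f"{survivability_gain([3.24], [1.48]):.1f}%")
```

The result is about 54.3 percent against the tabulated 54.5 percent; the small discrepancy is
presumably due to the table being computed from unrounded sample means.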

A significant improvement in the survivability measure is attained by dynamic routing. A
survivability gain of more than 50 percent is observed in most cases. Out of the three types of
networks, an MF-ETE-based network turns out to have a relatively low survivability gain. As
discussed in Section 4.5.1, a lost flow in such a network could be higher than in the other cases
even with dynamic routing due to a smaller amount of spare bandwidth installation. Furthermore,
the efficiency of end-to-end restoration could make restorability less sensitive to flow
assignment. As a result, a low survivability gain follows, although as much as 5 NL of lost flow
could be saved.

On the other hand, the highest survivability gain is mostly attained with an MF-LINE-based
network. This phenomenon can be explained by the opposite reasoning. A larger amount of
spare capacity enables dynamic routing to recover more traffic even with demand fluctuation.
Moreover, the line restoration scheme is not as efficient as the end-to-end restoration
scheme in terms of restoration capability. Therefore, restorability becomes more sensitive to flow
assignment. As a result, dynamic routing could save as much as 9 NL of lost flow on average.

Lost flow is not significant for a KSP-LINE-based network under either routing strategy.
Relatively more spare capacity is installed in a KSP-LINE-based network than in any other type of
network. This is not only because KSP restoration requires more redundant bandwidth for data
recovery, but also because the capacity assignment for such networks has been calculated through
heuristics. The result indicates that this additional capacity gives enough room to reduce a lost
flow even with static routing. With dynamic routing, a lost flow becomes nominal for a
KSP-LINE-based network.

7. Note that the SSR is based on the output of the survivable network design. Therefore, the scheme is
expected to attain a higher survivability measure than a routing scheme without any consideration of
network survivability, such as the dynamic routing scheme discussed in the next section.

8. The hat sign means a sample mean of the 30 observations.


Table 4-7. Comparison between SSR and SDR schemes (1): 100% RNL

Network model                    NJ-LATA         (12,50)         (24,60)         US-WAN
Maximum deviation                10%     20%     10%     20%     10%     20%     10%     20%

MF-ETE
Average loss by SSR(a) (NL)      2.17    4.45    [?]     8.70    0.75    [?]     0.38    [?]
Average loss by SDR (NL)         1.19    [?]     1.81    [?]     0.50    [?]     0.31    [?]
Survivability gain (%)           45.0    54.1    48.2    58.6    33.4    [?]     19.9*   [?]
Spare capacity(b) (NC)           52.86           25.31           [?]             [?]

MF-LINE
Average loss by SSR (NL)         1.93    4.30    2.81    4.05    2.55    2.69    9.20    9.51
Average loss by SDR (NL)         [?]     1.51    0.25    0.93    0.14    [?]     [?]     [?]
Survivability gain (%)           67.3    64.9    91.0    77.1    94.6    89.6    98.6    [?]
Spare capacity (NC)              54.45           35.52           [?]             [?]

KSP-LINE
Average loss by SSR (NL)         1.44    3.24    0.34    0.80    0.052   0.16    0.13    0.28
Average loss by SDR (NL)         0.70    1.48    0.058   0.26    0.0015  0.023   0.062   0.14
Survivability gain (%)           51.6    54.5    82.7    66.7    97.2    85.3    50.5    50.4
Spare capacity (NC)              62.41           43.99           102.58          90.87

a. The row entries of 'average loss' list the expected lost flow averaged over 30 random samples.
b. Spare capacity installed in the network in terms of normalized capacity (NC).
[?]: entry not legible. *: legible entry whose column is uncertain.


Table 4-8. Comparison between SSR and SDR schemes (2): 105% RNL

Network model                    NJ-LATA         (12,50)         (24,60)         US-WAN
Maximum deviation                10%     20%     10%     20%     10%     20%     10%     20%

MF-ETE
Average loss by SSR (NL)         15.79   18.77   28.78   35.59   17.37   18.23   11.49   12.42
Average loss by SDR (NL)         9.92    10.65   17.28   [?]     12.47   12.99   8.38    8.63
Survivability gain (%)           37.2    43.3    39.9    [?]     28.2    28.8    27.0    [?]
Spare capacity (NC)              52.86           25.31           [?]             [?]

MF-LINE
Average loss by SSR (NL)         12.87   13.40   15.60*  [?]     7.25    7.33    16.75   17.11
Average loss by SDR (NL)         [?]     9.66    [?]     [?]     5.47    5.46    3.57    3.64
Survivability gain (%)           35.6    27.9    34.1*   [?]     24.6    25.4    78.7    78.7
Spare capacity (NC)              54.45           35.52           [?]             [?]

KSP-LINE
Average loss by SSR (NL)         [?]     10.65   7.25    7.90    2.91    2.98    [?]     5.11
Average loss by SDR (NL)         7.33    8.29    5.61    6.22    0.91    0.97    2.63    2.68
Survivability gain (%)           [?]     22.2    22.6    21.3    68.6    67.3    47.4    47.7
Spare capacity (NC)              62.41           43.99           [?]             [?]

[?]: entry not legible. *: legible entry whose column is uncertain.


We have further carried out the same experiments at 105% RNL to examine the effectiveness of
the SDR scheme at a higher load. The results are summarized in Table 4-8. As expected, a larger
amount of lost flow can be avoided by the SDR scheme than at 100% RNL. Because a higher load is
offered to the network, a dynamic flow adjustment is more effective in reducing the expected lost
flow. However, the average survivability gain decreases because more lost flow is inevitable even
with the SDR scheme.

4.5.2.2.  Comparison  with  an  existing  dynamic  routing  scheme 

This  section  explores  the  effectiveness  of  the  proposed  routing  scheme  through  a  comparison 
with  an  existing  dynamic  routing  strategy.  Recall  that  no  existing  routing  scheme  has  aimed  at  pro¬ 
moting  the  restorability  of  a  self-healing  protocol.  The  objective  of  this  comparative  analysis  is  to 
examine  the  improvement  of  attainable  restorability  by  the  introduction  of  the  survivable  control 
at  the  VP  layer.  Furthermore,  we  attempt  to  clarify  how  the  SDR  scheme  can  enhance  the  surviv¬ 
ability  measure  by  analyzing  the  difference  in  the  obtained  flow  assignments. 

Out  of  possible  candidates  for  a  comparison,  we  have  chosen  the  traditional  dynamic  routing 
strategy  in  packet  switching  networks  where  average  network  delay  is  minimized.  A  flow  devia¬ 
tion  method  can  be  employed  to  solve  this  minimization  problem  [22].  This  scheme  is  referred  to 
as  the  DR  (dynamic  routing)  scheme  in  the  following  discussion.  A  close  relation  between  average 
delay  and  buffer  overflow  probability  suggests  that  the  DR  scheme  effectively  reduces  the  cell 
level  QOS  parameter  (cell  loss  probability)  to  a  near-optimal  level  by  minimizing  the  average 
delay.  In  fact,  this  approximation  has  been  used  before  in  the  context  of  dynamic  ATM  path  recon¬ 
figuration  [62].  Furthermore,  the  DR  scheme  could  avoid  more  lost  flow  than  other  traditional 
routing  strategies.  The  scheme  tends  to  balance  the  traffic  load  over  the  network,  and  thus  spare 
bandwidth  is  allocated  evenly  over  the  network.  Therefore,  the  results  obtained  through  compara¬ 
tive  analysis  between  the  DR  and  SDR  schemes  would  show  performance  improvement  over  one 
of  the  best  possible  routing  schemes  proposed  so  far. 
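
The load-balancing tendency of a delay-minimizing scheme can be seen from the delay objective
itself. The sketch below assumes the classical independent M/M/1 link model, in which the average
delay is the sum of f/(C - f) over all links divided by the total offered rate; this specific
formula, the function name and the numbers are assumptions for illustration and are not taken
from this section.

```python
from typing import Sequence


def average_delay(flows: Sequence[float], capacities: Sequence[float],
                  total_rate: float) -> float:
    """Average network delay under the assumed M/M/1 link model.

    flows and capacities are per-link values in the same units;
    total_rate is the total traffic offered to the network.
    """
    delay = 0.0
    for flow, capacity in zip(flows, capacities):
        if flow >= capacity:
            return float("inf")  # a saturated link makes the delay unbounded
        delay += flow / (capacity - flow)
    return delay / total_rate


# Two hypothetical links of capacity 10 carrying a total flow of 10.
unbalanced = average_delay([8.0, 2.0], [10.0, 10.0], total_rate=10.0)
balanced = average_delay([5.0, 5.0], [10.0, 10.0], total_rate=10.0)
print(unbalanced, balanced)
```

Since the per-link term grows sharply as a link approaches saturation, minimizing this objective
pushes flow away from heavily loaded links, which is why the residual bandwidth ends up spread
evenly over the network.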

In the experiments, the base traffic demands are uniformly increased from 90 to 110% RNL.
Thirty random demand patterns are utilized for each load. The random inputs deviate at most by 10
or 20 percent around the base demand and are uniformly distributed over the range. The
KSP-LINE-based network is examined in this study. The MF-based scheme requires an additional
preplanning process to find a restoration flow assignment, but such a provision is not provided in
any existing dynamic routing strategy.

Figures 4-6, 4-7 and 4-8 show the results of the experiments on the four sample networks. They
are based on the uniform traffic (UF) demand pattern, but the results for other base demand
patterns (WT and RD demand patterns) exhibit a similar trend for each network model.
This indicates that the results have no significant dependency on the base traffic demand as long as
the link capacity is properly designed according to the projected demand. In Figure 4-6, the
expected lost flow as well as the maximum lost flow per single-link failure are plotted
against the offered load. Figure 4-7 illustrates the improvement of the survivability measure
attained by the SDR scheme over that of the DR scheme. The improvement in the expected lost flow
as well as its maximum and minimum per single-link failure are shown in the figure. Finally,
Figure 4-8 depicts the average survivability gain, $(\hat{L}_{DR} - \hat{L}_{SDR}) / \hat{L}_{DR}$,
where $\hat{L}_{SDR}$ and $\hat{L}_{DR}$ are the average expected lost flow attained by the SDR and
DR schemes, respectively. The results at 100% and 105% RNL are also listed in Table 4-9.

Table 4-9. Comparison between DR and SDR schemes(a)

Network model                    NJ-LATA         (12,50)         (24,60)         US-WAN
Maximum deviation                10%     20%     10%     20%     10%     20%     10%     20%

100% RNL
Average loss by DR(b) (NL)       7.54    7.68    4.06    4.12    0.94    1.09    4.27    4.21
Average loss by SDR (NL)         0.70    1.48    0.058   0.26    0.0015  0.023   0.062   0.14
Survivability gain (%)           90.7    80.7    98.6    93.7    99.8    97.9    98.5    96.7
Average loss by SSR (NL)         1.44    3.24    0.34    0.80    0.052   0.16    0.13    0.28

105% RNL
Average loss by DR (NL)          13.85   13.53   11.99   11.88   2.54    2.63    7.10    6.86
Average loss by SDR (NL)         7.33    8.29    5.61    6.22    0.91    0.97    2.63    2.68
Survivability gain (%)           47.1    38.7    53.2    47.6    64.2    63.1    63.0    60.9
Average loss by SSR (NL)         9.20    10.65   7.25    7.90    2.91    2.98    5.00    5.11

a. The KSP-LINE restoration scheme is employed.
b. The row entries of 'average loss' list the expected lost flow averaged over 30 random samples.


Figure 4-6. Comparison of a lost flow in the SDR and DR schemes

[Panel (d): US-WAN Sample Network]

Thirty random demands, which deviate at most 10 percent from the base demand, are employed at
each load.


Figure 4-7. Improvement of survivability measure by SDR over DR

[Panel (d): US-WAN Sample Network]

Thirty random demands, which deviate at most 10 percent from the base demand, are employed
at each load.


The maximum improvement of a lost flow due to a single link failure amounts to 5 to 45 NL at
100% RNL, while the average is at most 7 NL. The average gain is not significant since, for some
links, the flow can be restored after their failure under either flow assignment scheme. For
example, the links 1-5, 1-8 and 5-8 in the NJ-LATA network have many disjoint restoration routes
(see Figure 3-4), and it is observed that a lost flow due to a failure of these links is zero in
either approach. On the other hand, significant improvement is observed in a failure of a link
adjacent to a node with degree 2, such as links 5-7 and 7-8 in the NJ-LATA network model. If such
a link fails, all active flow must be rerouted over its adjacent link. Thus, relatively more spare
capacity must be allocated over these links in order to achieve a high restorability. The DR
scheme fails to assign extra spare capacity to such links because it tries to balance the load
uniformly over the network. The SDR scheme, on the other hand, can allocate the traffic in such a
way that a flow would not be lost in such links. In other words, our algorithm can detect the
links vulnerable to a failure under the current traffic demand patterns and avoid allocating
excessive flow on these links in order to promote restorability over the network.
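
The vulnerability of the links adjacent to a degree-2 node can be made concrete with a small
calculation. The sketch below is a simplification under stated assumptions: line restoration of
the failed link can only use the other link of the degree-2 node, and any capacity released on
that link by traffic that itself crossed the failed link is ignored. The function name and the
numbers are hypothetical.

```python
def lost_flow_at_degree2_node(failed_link_flow: float,
                              other_link_capacity: float,
                              other_link_flow: float) -> float:
    """Flow that cannot be restored when one link of a degree-2 node fails.

    All restoration routes must traverse the node's other link, so the
    restorable flow is limited by the spare capacity left on that link.
    """
    spare = max(0.0, other_link_capacity - other_link_flow)
    return max(0.0, failed_link_flow - spare)


# Two links of capacity 10 attached to a degree-2 node.
print(lost_flow_at_degree2_node(6.0, 10.0, 6.0))  # both links carry 6 units
print(lost_flow_at_degree2_node(5.0, 10.0, 5.0))  # both links carry 5 units
```

In this simplified setting no flow is lost only if the combined working flow on the two links
stays within the capacity of one of them, which is the kind of restriction a load-balancing
objective does not take into account.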

The  attainable  survivability  measure  and  improvement  also  differ  according  to  the  network 
topology.  Out  of  the  four  sample  networks,  the  (24,60)  network  produces  the  least  lost  flow  and 
attains  the  least  improvement  by  the  SDR  scheme.  On  the  other  hand,  a  relatively  large  average 
lost  flow  is  observed  in  the  (12,50)  and  NJ-LATA  networks.  The  amount  of  installed  spare  capacity 
again  explains  the  difference  of  an  average  lost  flow:  A  network  with  less  spare  bandwidth  results 
in  a  larger  expected  lost  flow  at  a  higher  load.  However,  the  maximum  improvement  per  failure  is 
not  so  large  in  the  (12,50)  network  as  in  the  NJ-LATA  and  US-WAN  networks,  although  the  least 
spare  capacity  is  installed  in  the  (12,50)  network.  This  is  because  the  (12,50)  network  does  not 
include  a  node  with  degree  2.  As  mentioned  above,  the  links  adjacent  to  such  a  node  make  the  net¬ 
work  vulnerable  to  a  more  damaging  failure.  The  DR  scheme  fails  to  avoid  allocating  an  excessive 
flow  over  the  links,  and  thus  a  failure  of  those  links  significantly  contributes  to  a  large  improve¬ 
ment  by  the  SDR  scheme. 

When the load is not heavy, the SDR scheme works very well to prevent data loss. The maximum
lost flow is almost zero at 100% or less RNL, implying that virtually no flow is lost with
any single-link failure in this range of operation. On the other hand, as much as 40 NL of flow
can be lost with the DR scheme. As discussed before, the region with 100% or less RNL is the one
encountered in normal operating conditions, and our approach achieves a survivability gain of
almost 100% over this range. Even if the demand occasionally increases slightly beyond the
projected level, the SDR scheme can prevent most of the lost flow. For example, about 80% of the
lost flow can be saved when the demand increases by 2.5%.

When the load increases further, the survivability gain becomes small: about 50% at 105% RNL
and around 30% at 110% RNL. In this load region, the minimum improvement becomes negative.
Namely, the flow assignment of the DR scheme produces a smaller lost flow than that of the SDR
scheme for some link failures. This is possible since the SDR procedure optimizes the expected
lost flow over all possible single link failures rather than the lost flow of each individual
failure. Lost flow is inevitable against any link failure when the load gets heavy. Thus, it
becomes almost impossible to find a flow which is more restorable than others against every link
failure. On average, our approach still gives some improvement over the DR scheme. As discussed
before, 110% RNL is not in the normal operating region. If the load stays in this region due to
growing traffic demand, data loss becomes unavoidable with any routing strategy. A failure at the
most vulnerable link entails a huge service interruption. For example, 41 NL may be lost for some
link failures in the NJ-LATA network. In other words, the load is too heavy to prevent large loss
with the available network resources. Thus, additional resources must be installed before the
load increases to this level.


Figure 4-8. Survivability gain of SDR over DR

4.5.3. Restorability of three VP restoration schemes

This section reports the results of a comparative analysis of the restorability of the three fast restoration schemes: MF-LINE, MF-ETE and KSP-LINE. Attainable survivability measures are compared under the same network environment: the same capacity placement and offered load. The capacity assignment is calculated by the SCFA-MF-ETE scheme for a given projected traffic demand^9. A flow assignment is optimized using the respective VP reconfiguration procedures, and the restoration is performed by the corresponding restoration schemes. Figure 4-9 summarizes the results of this study for four sample networks. The load is increased from 90% to 110% RNL, and only a base demand is employed at each load.

9. The UF demand pattern is used in this experiment.

As expected, the MF-ETE scheme attains the highest restorability, while the KSP-LINE scheme suffers from the largest loss of data. However, the relative performance of each scheme differs significantly according to network topology. For example, the MF-ETE scheme has a great advantage over the other two schemes in the (24,60) network, while the difference between the MF-LINE and the KSP-LINE schemes is insignificant. At 100% RNL, the MF-ETE scheme could save about 42 NL and 45 NL of lost flow over the MF-LINE and KSP-LINE schemes, respectively. In the NJ-LATA network, on the other hand, the MF-ETE scheme does not offer a substantial benefit: the lost flow saved relative to the MF-LINE scheme is only about 4 NL at 100% RNL.

Regularity and connectivity of the network, which have been widely discussed in the previous section, again account for these phenomena. Since the (24,60) network is well-balanced, the end-to-end restoration scheme can distribute restoration flow evenly over the network. In addition, the sparseness and low node connectivity of the network make the line restoration scheme more inefficient. Since all restoration flow must be sent between the two nodes adjacent to a failed link, the restorable amount of flow is restricted to the total available spare bandwidth leaving the nodes. A network with a lower node degree is severely affected by this bandwidth restriction, which imposes a significant penalty on the restorability of the line restoration scheme. Furthermore, such a network is more susceptible to the effect of backhauling and looping, as discussed in the previous section. In the end, the MF-ETE scheme achieves a substantial improvement for the (24,60) network not only due to its effectiveness but also due to the inefficiency of the line restoration scheme for the network. On the other hand, since the NJ-LATA network is well-connected, the locality of restoration flow would not place a large penalty on the line restoration scheme. Irregularity of the network topology might diminish the benefit of the end-to-end restoration scheme. Therefore, the MF-ETE scheme is less attractive for the NJ-LATA sample network.
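
The bandwidth restriction on line restoration can be illustrated with a minimal sketch. It is not taken from the report: the function name, the dictionary layout and the toy link values are assumptions made for illustration. The sketch computes only an upper bound on the flow that line restoration can recover for one failed link, namely the smaller of the spare bandwidths available on the surviving links at its two end nodes. The flow actually restorable may be lower.

```python
def line_restoration_bound(spare, failed_link):
    """Upper bound on the flow restorable by line restoration for one failed link.

    spare maps an undirected link (u, v) to its spare bandwidth (hypothetical layout).
    """
    u, v = failed_link
    failed = frozenset(failed_link)

    def spare_at(node):
        # Spare bandwidth on all surviving links incident to this node.
        return sum(bw for link, bw in spare.items()
                   if node in link and frozenset(link) != failed)

    # All restoration flow must leave one end node and enter the other.
    return min(spare_at(u), spare_at(v))


# Toy network (assumed values); node 4 has degree 2.
example_spare = {(1, 2): 10, (2, 3): 4, (2, 4): 3,
                 (1, 5): 20, (3, 5): 20, (4, 5): 20}
```

In this toy network, a failure of link 4-5 can restore at most the 3 units of spare bandwidth on link 2-4, because node 4 has only one other link. This is the penalty that a low node degree imposes on line restoration.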

Figure 4-9. Restorability of the three SVPR schemes

4.5.4. The effect of two-step restoration

Two-step restoration aims not only at performing a quick recovery from a network failure but also at obtaining an optimal flow assignment against a possible subsequent network failure. Network-wide restoration (NWR) is performed to realize an optimal VP assignment after the temporary service interruption is recovered by fast restoration mechanisms. This section investigates the effect of the two-step restoration scheme through a comparison of attainable survivability with and without the NWR after a failure. An expected lost flow due to two successive link failures is used to measure the performance. All possible combinations of link failures are considered, and they are assumed to be equally likely to occur. Formally, an expected lost flow due to two successive link failures, $L^2(f)$, is defined as follows:

$$L^2(f) = \frac{1}{|E|} \sum_{l \in E} L_l^2(f)$$

$$L_l^2(f) = L_l^1(f) + \frac{1}{|E|-1} \sum_{e \in E,\ e \neq l} L_e^2(f_l)$$

where two complete link cuts are assumed to happen at links $l$ and $e$ consecutively. $L_l^1(f)$ is a lost flow upon restoration of the first failure at link $l$, where $f$ is a flow assignment before the failure. $L_e^2(f_l)$ is a lost flow due to the second failure at link $e$, which happens under a new topology without link $l$ and under a new flow assignment, $f_l$. $f_l$ is a flow obtained after a failure of link $l$. Note that $f_l$ takes a different value depending on the availability of the NWR function. If some amount of flow cannot be recovered upon restoration of the first failure, a part of the VP demand cannot be satisfied after the failure. In such a case, a part of the requested VP bandwidth must be rejected. If the NWR is applied, however, the flow assignment $f_l$ largely depends on how the demand is discarded. In the calculation of $f_l$, we assume that this demand rejection takes place uniformly over the affected VP's. Finally, $L_l^2(f)$ gives an expected lost flow due to two successive link failures, assuming the first failure occurs at link $l$.
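
The definition above amounts to a simple averaging procedure, sketched below. The sketch assumes that the per-failure lost flows have already been computed by some restoration model and are supplied as dictionaries. The function and argument names are hypothetical, and the equal weights 1/|E| and 1/(|E|-1) follow from the stated assumption that all failure combinations are equally likely.

```python
def expected_two_failure_loss(links, first_loss, second_loss):
    """Expected lost flow due to two successive link failures.

    links:        list of link identifiers (the set E)
    first_loss:   first_loss[l] = lost flow after restoring a first failure at l
    second_loss:  second_loss[(l, e)] = lost flow at a second failure e,
                  given the topology and flow that result from a first failure at l
    Returns (overall expectation, per-first-failure expectations).
    """
    if len(links) < 2:
        raise ValueError("at least two links are needed for two failures")

    per_first = {}
    for l in links:
        others = [e for e in links if e != l]
        # Average over all equally likely second failures e != l.
        mean_second = sum(second_loss[(l, e)] for e in others) / len(others)
        per_first[l] = first_loss[l] + mean_second

    # Average over all equally likely first failures.
    overall = sum(per_first.values()) / len(links)
    return overall, per_first


# Toy data (assumed values, in NL).
toy_links = ["a", "b", "c"]
toy_first = {"a": 0, "b": 2, "c": 2}
toy_second = {("a", "b"): 6, ("a", "c"): 2,
              ("b", "a"): 0, ("b", "c"): 4,
              ("c", "a"): 2, ("c", "b"): 2}
```

The availability of the NWR function enters only through `second_loss`, since it changes the flow assignment in effect when the second failure occurs.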

The relative network load from 80 to 105 percent is explored in the experiment. Only the base demand is used at each load. The initial flow before the first failure, f, is assumed to be computed by the proposed VP assignment procedures. The three restoration strategies are considered, and the MFD method is employed in the experiments on KSP-LINE-based networks. After fast restoration from a failure, some of the restored VP's might be routed back over a link already traversed in the case of line restoration (backhauling). For example, a VP with a route 2-3-4 in the US-WAN network (see Figure 3-4) may be rerouted over path 2-3-2-7-8-4 upon a link 3-4 failure, so that the path goes back and forth over the link 2-3. For KSP-LINE-based networks, the NWR procedure removes these unnecessary loops before triggering the optimization process. Then, the procedure uses the resulting flow assignment as its initial solution. This preprocessing is desirable since such loops may not be removed completely by the algorithm. Since the algorithm usually finds a new flow along the line between the current flow and a shortest route flow, some amount of bandwidth tends to remain allocated over the loops. In addition, it is beneficial to get rid of these obviously inefficient routes beforehand because the iterations needed for their removal can be eliminated. This preprocessing is unnecessary for MF-LINE-based networks since the solution procedure does not use the current solution as a starting point.
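
The report does not spell out how the loops are removed. One straightforward possibility, given here only as a sketch with a hypothetical function name, is to scan the restored route and cut every cycle back to the first visit of the repeated node.

```python
def remove_loops(path):
    """Remove every cycle from a node sequence.

    When a node is visited a second time, the nodes between the two
    visits are dropped, so the result is a simple path with the same end points.
    """
    result = []
    position = {}  # node -> index of that node in result
    for node in path:
        if node in position:
            cut = position[node]
            # Forget the nodes that formed the loop, then truncate the route.
            for dropped in result[cut + 1:]:
                del position[dropped]
            del result[cut + 1:]
        else:
            position[node] = len(result)
            result.append(node)
    return result


# The backhauled route from the US-WAN example in the text.
restored_route = [2, 3, 2, 7, 8, 4]
cleaned_route = remove_loops(restored_route)
```

Applied to the example route 2-3-2-7-8-4, the sketch yields 2-7-8-4, which releases the bandwidth that the detour held on link 2-3 in both directions.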

The results are illustrated in Figures 4-10 to 4-18. An average normalized lost flow, $L^2(f)$, is depicted in Figures 4-10, 4-13 and 4-16 for both cases, with and without the NWR function. Figures 4-11, 4-14 and 4-17 demonstrate the improvement of the survivability measure with the NWR function. The maximum improvement on $L_l^2(f)$ ($l \in E$) is depicted in addition to the average improvement on $L^2(f)$. Finally, Figures 4-12, 4-15 and 4-18 show the average survivability gain due to the NWR, given by $(L_{WO}^2 - L_W^2) / L_{WO}^2$, where $L_W^2$ and $L_{WO}^2$ are the expected lost flow, $L^2(f)$, with and without the NWR, respectively. The figures also plot the maximum survivability gain at each load, which is defined by $\max_{l \in E} \{ (L_{l,WO}^2 - L_{l,W}^2) / L_{l,WO}^2 \}$, where $L_{l,W}^2$ and $L_{l,WO}^2$ are the expected lost flow, $L_l^2(f)$, due to a failure of link $l$ and another subsequent link failure with and without the NWR, respectively.
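
The two gain measures can be sketched as follows. This is an illustration under assumptions, not the report's code: the function name and the input layout are hypothetical, and the average gain is assumed to be computed from the averaged lost flows rather than by averaging per-link ratios.

```python
def survivability_gains(loss_without, loss_with):
    """Average and maximum survivability gain due to the NWR.

    loss_without[l], loss_with[l]: expected lost flow due to a first failure
    at link l plus a subsequent failure, without and with the NWR.
    Gains are returned as fractions (1.0 means 100%).
    """
    links = list(loss_without)
    mean_without = sum(loss_without[l] for l in links) / len(links)
    mean_with = sum(loss_with[l] for l in links) / len(links)

    # Average gain: relative reduction of the averaged lost flow (assumed reading).
    average_gain = (mean_without - mean_with) / mean_without if mean_without > 0 else 0.0

    # Maximum gain: best relative reduction over all first-failure links.
    per_link = [(loss_without[l] - loss_with[l]) / loss_without[l]
                for l in links if loss_without[l] > 0]
    maximum_gain = max(per_link) if per_link else 0.0
    return average_gain, maximum_gain


# Toy values in NL (assumed).
toy_without = {"a": 10.0, "b": 6.0, "c": 4.0}
toy_with = {"a": 5.0, "b": 6.0, "c": 0.0}
```

A maximum gain of 1.0 corresponds to a first-failure link for which the NWR removes all of the expected lost flow, as for link "c" in the toy data.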

In the case of line restoration-based networks, on average 10 to 30 NL of lost flow can be avoided with the NWR at 100% RNL, and 21 to 68 NL at most (Figures 4-11 and 4-14). The maximum survivability gain is 100% at a lower load (Figures 4-12 and 4-15). This implies that the NWR can completely restore a flow against a certain scenario of successive link failures. For example, suppose the link 5-8 fails in the KSP-LINE-based NJ-LATA network and the load is below 90%. If the NWR function is used, a flow can be restored from any successive failure, while on average 6 NL of flow is lost at the time of the second failure without the NWR function. Note that the average lost flow due to two successive failures takes a positive value even with 80% relative load. More capacity is required to realize full restorability against multiple link failures.

The average and maximum improvements increase as the load grows to the projected level. Due to the reduced amount of spare capacity, a flow must be assigned more efficiently at a higher load in order to restore more bandwidth against successive outages. This indicates that the NWR function is an effective way to promote network survivability in the presence of a network failure. Further growth of the network load, however, causes the improvement to saturate and decrease. At this saturation level, data loss becomes inevitable even with the NWR function. Such saturation occurs at a lower load for the (12,50) network due to a smaller spare capacity installation.

In the case of end-to-end restoration-based networks, the improvement of the survivability is significantly smaller (Figure 4-17). On average, 6 to 9 NL of lost flow can be avoided with the NWR at 100% RNL. Survivability gain is also considerably smaller than that for line restoration-based networks (Figure 4-18). End-to-end restoration is immune to backhauling and can exploit the spare bandwidth more effectively than line restoration, even without the NWR function. The smaller improvement is due to this efficiency of the restoration mechanism. Furthermore, less spare capacity is installed in ETE-based networks, making them more susceptible to multiple failures. Thus, a relatively larger lost flow is incurred even with NWR. The expected lost flow due to two successive failures increases rather quickly at a higher load, as observed in Figure 4-16. Because of a smaller improvement and a larger amount of data loss, the scheme can only attain a smaller survivability gain than the other schemes.

4.6.  Summary 

This  section  presents  survivable  virtual  path  routing  problems  for  self-healing  ATM  networks 
based  on  three  fast  restoration  protocols.  A  similar  approach  to  the  SCFA  problem  can  be  applied 
to  the  MF-based  networks,  while  two  novel  algorithms  are  developed  for  the  KSP-based  networks. 
The  modified  flow  deviation  method  is  developed  to  solve  a  non-smooth  multicommodity  flow 
problem. A near-optimal solution can be found instantly while avoiding kinks by adjusting an optimization parameter. A Lagrangian relaxation-based approach further advances the quality of the
solution  although  it  requires  more  computation  time.  This  approach  also  helps  to  obtain  a  lower 
bound  estimate  of  the  optimum. 

Numerical  experiments  reveal  that  the  proposed  algorithms  work  very  well  to  avoid  a  lost  flow 
when  a  network  load  is  at  the  designed  level.  The  proposed  approach  can  detect  the  links  that 
make  the  network  vulnerable  to  a  more  damaging  failure  under  the  current  traffic  demand  pattern 
and  adjust  a  flow  so  as  to  improve  the  survivability  measure.  When  the  load  grows  further,  the 
attainable  survivability  measure  depends  on  the  amount  of  spare  capacity  installed  in  the  network 
as  well  as  the  efficiency  of  the  restoration  schemes.  The  two-step  restoration  scheme  is  also  shown 
to be very effective in promoting restorability after a failure. In the previous section, the three restoration schemes have been evaluated and compared in terms of required spare capacity cost. The results in this section give two additional factors in decision making on the selection of the fast restoration schemes: restorability and applicability. The comparative analysis reveals that restorability largely depends on the regularity and connectivity of a network. Applicability of the MF-ETE scheme turns out to be somewhat restricted for a large network, and further speed-up of
the solution procedure should be addressed in future research.

Figure 4-10. Attainable survivability measure with and without NWR (MF-LINE)
Figure 4-11. Improvement of survivability measure with NWR (MF-LINE)
Figure 4-12. Survivability gain due to NWR (MF-LINE)
Figure 4-13. Attainable survivability measure with and without NWR (KSP-LINE)
Figure 4-14. Improvement of survivability measure with NWR (KSP-LINE)
Figure 4-15. Survivability gain due to NWR (KSP-LINE)
Figure 4-16. Attainable survivability measure with and without NWR (MF-ETE)
Figure 4-17. Improvement of survivability measure with NWR (MF-ETE)
Figure 4-18. Survivability gain due to NWR (MF-ETE)

Panels in each figure: (a) NJ-LATA Sample Network, (b) (12,50) Sample Network, (c) (24,60) Sample Network, (d) US-WAN Sample Network. Horizontal axis: Relative Network Load (%). Vertical axes of the improvement and gain figures: Improvement (NL) and Survivability Gain (SG) (%), respectively.


V  Conclusion 


This  concluding  section  summarizes  the  major  contributions  of  this  work  and  suggests  possible 
directions  for  future  research. 

5.1.  Summary  of  Major  Contributions 

The significant contributions of this project are in the field of survivable network management for high-speed ATM inter-office networks. The project has provided an integrated view of survivable ATM network management systems with fast virtual path restoration capabilities. Two concepts, two-step restoration and the survivable ATM network management architecture, have been introduced. We have identified two key open issues, the SCFA and SVPR problems, and have developed the optimization models and the solution procedures for the problems. The proposed algorithms have been applied to a comprehensive comparative analysis of the fast restoration protocols. Figure 5-1 summarizes these contributions.

The first major contribution of this project is the identification of three key issues for survivable ATM network management and their consolidation in order to deploy efficient and cost-effective survivable ATM networks. The identified issues are the fast restoration scheme, the survivable virtual path routing (SVPR) problem and the survivable capacity and flow assignment (SCFA) problem. The fast restoration mechanism performs run-time rerouting of affected virtual paths to provide service continuity, while the SVPR problem aims at maximizing restorability through dynamic virtual path reconfiguration in response to demand dynamics. The SCFA problem designs a fully restorable facility network with minimum cost, with the aid of a joint optimization procedure. The restorability of a fast restoration scheme depends completely on network design and virtual path flow assignment. Therefore, these three key issues should be addressed in an integrated manner to effectively realize survivable networks, but all previous works have overlooked this integrated treatment. These survivability functions are integrated into an existing ATM network resource architecture, leading to the survivable ATM network management architecture proposed in this project. The complexity of the resource management process is reduced by classifying different levels of network resources and traffic entities into layers. Survivability functions are effectively consolidated into the virtual path layer and facility network layer.

Another fundamental contribution of this project is the introduction of the two-step restoration concept in order to enhance the attainable survivability level after a failure. It accommodates two contradicting requirements when a network failure happens: fast restoration and optimal virtual path reconfiguration. The scheme has been shown to effectively prevent lost flow due to a subsequent failure, especially in line restoration-based networks.

Figure 5-1. Major contributions of the project

Optimization models have been developed for the SCFA and SVPR problems. The problem formulation differs depending on the fast restoration scheme employed in the network. Three representative fast restoration schemes have been identified, and distinct solution procedures have been proposed accordingly. The problems can be formulated as a large-scale linear programming problem for MF-based networks. In order to overcome the huge number of constraints inherent to the problem, several mechanisms have been utilized, such as a column generation technique and the arc-chain flow representation, which help to reduce the number of constraints to a manageable size. Among the employed techniques, the original ideas are the identification of the basis matrix arrangement and the development of the direct LU submatrix update mechanism. The identified basis matrix arrangement facilitates the LU decomposition of the basis matrix, and its validity has been proven in the project. These two mechanisms significantly reduce the computation time of the algorithm. Row generation and deletion mechanisms have been further devised to cope with the explosive number of constraints for the ETE-based networks.

The SVPR-KSP-LINE problem has been formulated as a non-differentiable multicommodity flow problem with linear constraints. The modified flow deviation method is another significant contribution of this project. The premature convergence around kinks, which is inherent to non-smooth optimization, can be avoided by properly adjusting optimization parameters. It has been shown that the procedure quickly converges to a near-optimal solution. Further lost flow prevention can be attained by a variation of the Lagrangian relaxation technique proposed in this project, but it comes at the expense of computation time. This procedure also helps in finding a tighter lower bound estimate.

Finally, the performance evaluation of the proposed algorithms has not only verified the effectiveness of the proposed survivable management systems but has also given an insight into the benefit of each restoration scheme. The economic benefit of the end-to-end restoration scheme has been investigated in particular, and the effect of network topology has been examined. Previous comparative studies have only been made qualitatively or through a simulation study. With the proposed optimization procedures, we have clarified for the first time their pros and cons in terms of the optimal capacity installation cost. Contrary to a wide belief in the economic advantage of the end-to-end restoration scheme, our comprehensive study has revealed that the attainable gain could be nominal for well-connected and/or unbalanced networks.

5.2.  Future  Extensions 

This is the first work on survivable ATM network management where fast restoration mechanisms are taken into account. We have addressed the most fundamental and important aspects of
the  problem  in  this  project.  However,  there  are  still  several  open  issues  that  require  further 
research.  We  conclude  the  project  by  attempting  to  identify  some  important  issues  to  be  addressed 
as  an  extension  of  the  research. 

• Topological effect and topological design: Further investigation would be necessary to clarify the effect of network topology on the required spare capacity cost as well as restorability. This study would be useful for a topological design which is required to construct a new transmission network.^1

• Cost function: In the SCFA problem, we assume a linear cost function. Its major benefit is the substantial simplification of the solution procedure, which accounts for the popularity of the linear approximation in most previous works. However, other cost functions might be preferable if a more precise approximation of the installation cost is necessary. For example, a logarithmic cost function can accommodate the following economy-of-scale effect: as the channel size increases, the incremental cost per bit decreases [44].

• Failure type: In this work, a complete span cut has been considered as the most plausible failure event, and the survivable network management scheme has been developed based on this assumption. Other types of failure scenarios should be taken into account to enhance the network survivability if their likelihood is not small enough to neglect. Among such candidates, a node failure would be the most important event to consider in future research.

• Speed-up: Further work is required on the SVPR-MF-ETE solution procedures so that they complete within a reasonable computation time at a higher load. Heuristics or polynomial-time linear programming techniques [42] might be investigated.

1.  When  applied  to  an  existing  network  where  a  network  topology  is  given,  however,  the  design  approach  developed  in 
this  project  would  suffice.  In  the  design  of  a  survivable  network  management  system,  a  fast  restoration  scheme  should  be 
chosen  with  respect  to  the  economy,  restorability,  applicability,  restoration  speed  and  control  complexity.  The  proposed 
solution  procedures  would  help  this  decision-making  process  in  terms  of  the  first  three  criteria. 


• Lower level QOS: When a load is less than the designed level, a solution to the SVPR problem is not usually unique. Although any solution could be selected from a survivability viewpoint, a different VP assignment could lead to a different quality of service at a lower layer. Our approach, of course, guarantees that the QOS exceeds the designated level by satisfying the VP-level traffic demand. However, it would be desirable to have some mechanism which selectively chooses a solution from the optimal set, since this could further improve the QOS at the call and cell layers.

• Initial feasible solution to the SVPR problem: It is possible that no feasible solution exists for a given traffic demand. In such a case, a part of the demand must be rejected. For example, we must find a flow assignment which minimizes the amount of rejected flow^2. Note that although this precaution must be installed in the network, the above situation would seldom happen if a network is properly designed. Significantly more redundant capacity is required in order to assure full restorability against any link failure for a projected traffic demand. Therefore, a feasible flow can be obtained unless considerably heavier traffic demand is offered to the network than the projected level or unless multiple network failures are present. We have not encountered such a situation in our experiments, where the load is increased up to 25 percent more than the projected level. Since demand projection is based on the busy hour traffic for the forthcoming design period (say a year), a load should typically be 100% RNL or less in normal operating conditions. Furthermore, in long-haul or regional exchange networks, where precautions against a network failure are highly necessary, a network load is typically smoothed out since a large amount of traffic is integrated there. Therefore, 125% or more RNL would arise only by overshooting demand across the entire network, which would seldom happen in a practical situation.

• Transient phenomena after a failure: With the fast restoration scheme, the effect of a failure would be confined to the affected virtual paths. However, if excessive data retransmission is necessary, it might also affect other traffic along the restoration routes for some duration after a failure. This would happen if data traffic dominates the network. One possible solution is to increase the required bandwidth over restoration paths in order to accommodate the effective bandwidth during this congested period. Further investigation would be necessary to determine the (VP-level) equivalent bandwidth on such an occasion.


2. We can formulate this problem as an LP problem.

• Multiple projected demand patterns: This project considers a single projected traffic demand pattern in the network synthesis problem. However, it would be preferable to consider multiple projected traffic demands in certain situations. For example, such a provision would be appropriate for a network covering a wide area with multiple time zones because it creates multiple busy hour traffic patterns. Since this would significantly increase the dimension of the problem, further measures should be devised to solve the problem in a reasonable time.

• SCFA-KSP-LINE: A heuristic approach has been applied to the SCFA-KSP-LINE problem. Further research is required to find a better solution to this problem. A Lagrangian relaxation technique might also be applied to this problem.


Appendix A. SCFA-MF-LINE: Problem formulation

The following notations are used in the formulation of the SCFA-MF-LINE problem^1:

$P^\pi$ : A set of all possible routes for commodity $\pi \in \Pi$.
$RP^a$ : A set of all possible restoration routes for arc $a \in A$.
$x = (x_p^\pi)$ $(\forall \pi \in \Pi, \forall p \in P^\pi)$ : A commodity path flow vector.
$r = (r_p^a)$ $(\forall a \in A, \forall p \in RP^a)$ : An arc restoration path flow vector.
$z = (z_a)$ $(\forall a \in A)$ : A spare capacity vector.
[symbol illegible] $(\forall a \in A)$ : A unit arc installation cost. It is assumed to be non-negative.
$E_a$ : A set of all links except for the one containing arc $a$.
[symbol illegible] : An arc-path indicator variable which equals 1 if arc $a$ is contained in path $p$, and 0 otherwise.
$f_a$ $(\forall a \in A)$ : The amount of a flow of arc $a$.
$z_a^l$ $(\forall a \in A, \forall l \in E_a)$ : The amount of necessary spare bandwidth of arc $a$ upon a failure of link $l$.
$\sigma^\pi$ $(\forall \pi \in \Pi)$ : The price of commodity (dual variable).
[symbol illegible] $(\forall a \in A)$ : The price of arc restoration (dual variable).
[symbol illegible] $(\forall a \in A, \forall l \in E_a)$ : The price of arc per failure (dual variable).

Assuming  that  the  network  resource  installation  cost  is  a  linear  function  of  arc  capacity,  the 


1.  Refer  to  Appendix  T  for  the  complete  listing  of  the  notations  used  in  the  report. 
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SCFA-MF-LINE  problem  becomes  a  linear  programming  (LP)  problem.  The  arc-chain  flow  repre¬ 
sentation  [9]  is  employed  on  the  commodity  flow  as  well  as  the  restoration  flow^.  The  SCFA-MF- 
LINE  problem  and  its  dual  problem  (SCFA-MF-LINE-DUAL)  are  formulated  as  follows: 


Minimize

$$\sum_{a \in A} d_a c_a$$    (SCFA-MF-LINE)

over $x \ge 0,\ r \ge 0,\ z \ge 0$

subject to

a) $\sum_{p \in P^\pi} x_p^\pi = q^\pi \quad \forall \pi \in \Pi$    (A-1)

b) $z_a - z_a^l \ge 0 \quad \forall a \in A,\ \forall l \in E_a$    (A-2)

c) $\sum_{p \in RP^a} r_p^a - f_a = 0 \quad \forall a \in A$    (A-3)

where $c_a = f_a + z_a$ and $z_a^l = \sum_{b \in l} \sum_{p \in RP^b} \delta_{a,p} \, r_p^b$.

Maximize

$$\sum_{\pi \in \Pi} q^\pi \sigma^\pi$$    (SCFA-MF-LINE-DUAL)

over $\sigma^\pi$, $w_a$ unrestricted, $\mu_a^l \ge 0$

subject to

i) $\sum_{a \in p} (d_a + w_a) \ge \sigma^\pi \quad \forall p \in P^\pi,\ \forall \pi \in \Pi$    (A-4)

ii) $\sum_{b \in p} \mu_b^l \ge w_a \quad \forall p \in RP^a,\ \forall a \in A \ (l \ni a)$    (A-5)

iii) $\sum_{l \in E_a} \mu_a^l \le d_a \quad \forall a \in A$    (A-6)


2.  There are two ways to represent a flow in a network. The arc-node flow representation uses a collection of arc flows
to represent a network flow, while the arc-chain flow representation employs a set of path (and cycle) flows [4] [9].


The flow conservation law (A-1) says that the traffic demand of a commodity is satisfied over
virtual paths for the commodity. The capacity constraints (A-2) state that the spare capacity of
each arc must be at least equal to the amount of the arc bandwidth required for restoration from
any single-link failure. The constraints are expressed for every combination of a link failure event
and a remaining arc after the failure. The full restorability constraints (A-3) enforce restoration of
all affected arc flow over the restoration paths. $\sigma^\pi$, $\mu_a^l$ and $w_a$ are the simplex
multipliers corresponding to the constraints (A-1), (A-2) and (A-3), respectively.
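The roles of the three constraint families can be illustrated with a small feasibility checker. This is a minimal sketch rather than the report's implementation: the function name `evaluate_assignment`, the dictionary-based data layout, and the three-arc triangle example (with each link assumed to carry a single arc) are all assumptions made for illustration.

```python
def evaluate_assignment(arcs, link_of, d, q, x, r, z, tol=1e-9):
    """Check (A-1), (A-2), (A-3) for a candidate solution and return its cost.

    x: {(commodity, path): flow}, r: {(failed_arc, path): flow}, z: {arc: spare}.
    A path is a tuple of arc ids. Restoration paths are assumed to avoid
    the link that contains the failed arc.
    """
    f = {a: 0.0 for a in arcs}            # f_a: working flow on arc a
    served = {pi: 0.0 for pi in q}        # total path flow per commodity
    for (pi, path), flow in x.items():
        served[pi] += flow
        for a in path:
            f[a] += flow

    need = {}                             # z_a^l: spare needed on a when l fails
    restored = {a: 0.0 for a in arcs}     # total restoration flow per arc
    for (b, path), flow in r.items():
        restored[b] += flow
        failed_link = link_of[b]
        for a in path:
            key = (a, failed_link)
            need[key] = need.get(key, 0.0) + flow

    a1 = all(abs(served[pi] - q[pi]) <= tol for pi in q)            # (A-1)
    a2 = all(z[a] + tol >= amt for (a, _), amt in need.items())     # (A-2)
    a3 = all(abs(restored[a] - f[a]) <= tol for a in arcs)          # (A-3)
    cost = sum(d[a] * (f[a] + z[a]) for a in arcs)                  # sum d_a c_a
    return cost, a1, a2, a3


# Hypothetical triangle network: one unit of demand routed on arc "12",
# restored over arcs "13" and "23" if link "12" fails.
arcs = ["12", "13", "23"]
link_of = {a: a for a in arcs}
d = {a: 1.0 for a in arcs}
q = {"pi": 1.0}
x = {("pi", ("12",)): 1.0}
r = {("12", ("13", "23")): 1.0}
z = {"12": 0.0, "13": 1.0, "23": 1.0}
result = evaluate_assignment(arcs, link_of, d, q, x, r, z)
```

In this toy instance one unit of working capacity and two units of spare capacity are installed, so the linear cost is 3 with unit costs of 1.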


Appendix  B.  SCFA-MF-LINE:  Solution  approach 

The SCFA-MF-LINE problem is modeled as a large-scale LP problem with a special structure.
Since direct application of an LP algorithm to this problem is computationally intensive,
the following modifications are made. First, the arc-chain flow representation is selected
instead of the arc-node representation [9]. The arc-chain flow representation significantly reduces
the size of the constraint set, especially for a large network. For example, the number of flow
conservation constraints is reduced from $|\Pi| \cdot |V|$ ($O(V^3)$) to $|\Pi|$ ($O(V^2)$): from 1,210
constraints to 110 constraints for the NJ-LATA network³, and from 21,168 constraints to 756
constraints for the US-WAN network⁴.
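The constraint counts quoted above can be checked with a few lines of arithmetic. The sketch below assumes one commodity per ordered source-destination node pair, and the node counts 11 and 28 are inferred from the quoted figures (1,210 / 110 and 21,168 / 756), not stated in the text; the function name is hypothetical.

```python
def flow_conservation_rows(num_nodes):
    """Return (arc-node rows, arc-chain rows) for the flow conservation law.

    Assumes one commodity per ordered node pair, so |Pi| = n * (n - 1).
    Arc-node form needs one row per commodity and node; arc-chain form
    needs one row per commodity.
    """
    commodities = num_nodes * (num_nodes - 1)
    return commodities * num_nodes, commodities


nj_lata = flow_conservation_rows(11)
us_wan = flow_conservation_rows(28)
```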

3.  A network with n nodes and m arcs is referred to as an (n, m) network in this report.

4.  It is assumed that one commodity is defined for each source and destination node pair in this calculation.

The column generation technique is applied to accommodate infinitely many path variables in
the arc-chain flow representation [50]. The technique generates variables as needed during the
course of the algorithm instead of listing all columns at once. This strategy utilizes the fact that
only a few paths will actually be used in the optimal solution. The solution procedure is
decomposed into a master process and a sub-process. The sub-process adopts the simplex algorithm and
calculates an optimal solution using the generated columns. The master process, on the other hand,
checks whether the obtained solution is globally optimal. Global optimality is verified if
the dual feasibility conditions (A-4) and (A-5) hold⁵. Although the conditions must be satisfied for
all commodity paths and all restoration paths, it suffices to examine their shortest paths. The
condition (A-4) is satisfied if the length of the shortest commodity path of $\pi$ under the modified arc
length $\{d_a + w_a\}$ is not less than $\sigma^\pi$. The condition (A-5) holds if the length of the shortest
restoration path of arc a under the metric $\{\mu_b^l : a \in l,\ b \in A\}$ is not less than $w_a$. Dijkstra's
algorithm can be applied to find the shortest paths for each commodity and for each arc. If the dual
feasibility conditions are violated for any shortest path, then the total capacity installation cost
could be reduced by using this path. Therefore, the master process generates all violated shortest
path variables⁶ and invokes the sub-process. Otherwise, the obtained solution is globally optimal.
After the completion of the procedure, a post-processing routine is called to make the link bandwidth
a multiple of that of available optical fiber cables (round-up procedure). The proposed solution
procedure is summarized in Figure B-1. The initialization process generates only the columns
pertaining to the shortest commodity paths and the shortest restoration paths and obtains an initial
feasible solution.
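The master-process test for the commodity columns can be sketched as a pricing step: run Dijkstra's algorithm under the modified arc lengths $d_a + w_a$ and generate a new path column whenever the shortest length falls below $\sigma^\pi$. The following is a minimal sketch under stated assumptions, not the report's code: the function names, the dictionary layout, and the three-arc example are hypothetical, and arc lengths are assumed non-negative (as Lemma C-3 asserts at the end of the sub-process).

```python
import heapq


def shortest_path(adj, source, target):
    """Dijkstra on adj = {node: [(neighbor, arc_id, length)]}; lengths >= 0."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d_u, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, arc, length in adj.get(u, []):
            nd = d_u + length
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                prev[v] = (u, arc)
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    path, node = [], target
    while node != source:
        node, arc = prev[node]
        path.append(arc)
    return dist[target], path[::-1]


def price_commodities(arcs, d, w, sigma, commodities, tol=1e-9):
    """Return (commodity, path) columns whose shortest path violates (A-4)."""
    adj = {}
    for a, (u, v) in arcs.items():
        adj.setdefault(u, []).append((v, a, d[a] + w[a]))
    new_columns = []
    for pi, (s, t) in commodities.items():
        length, path = shortest_path(adj, s, t)
        if length < sigma[pi] - tol:      # dual constraint (A-4) violated
            new_columns.append((pi, path))
    return new_columns


# Hypothetical instance: the current basis prices commodity "pi" at 5,
# but a two-hop route of modified length 2 exists.
arcs = {"a1": (1, 2), "a2": (1, 3), "a3": (3, 2)}
d = {"a1": 5.0, "a2": 1.0, "a3": 1.0}
w = {"a1": 0.0, "a2": 0.0, "a3": 0.0}
commodities = {"pi": (1, 2)}
violated = price_commodities(arcs, d, w, {"pi": 5.0}, commodities)
satisfied = price_commodities(arcs, d, w, {"pi": 2.0}, commodities)
```

The restoration-path test (A-5) would follow the same pattern with the metric $\mu_b^l$ on the network without the failed link and $w_a$ as the threshold.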

The  revised  simplex  method  has  been  widely  used  due  to  its  effectiveness  in  reducing  both  run¬ 
ning  time  and  memory  space  [17]  [54],  and  it  is  applicable  to  the  sub-process.  Each  iteration  of  the 
revised  simplex  method  requires  solving  two  systems  of  linear  equations  to  find  an  entering  col¬ 
umn  vector  and  a  set  of  simplex  multipliers  [17].  The  basis  matrix  is  factorized  into  an  LU  form  in 
order  to  facilitate  the  solution  of  the  linear  systems^.  The  computational  load  of  the  revised  sim¬ 
plex  method  is  dominated  by  the  update  of  the  LU  matrices  and  the  solution  of  two  linear  equa¬ 
tions.  Thus,  the  dimension  of  the  basis  matrix  largely  determines  the  speed  of  the  algorithm.  In  the 
SCFA-MF-LINE  problem,  the  dimension  can  be  to  the  order  of  10^  or  more  for  a  large  network, 
even  with  the  reduction  introduced  by  the  arc-chain  flow  representation.  Since  the  required  com¬ 
putation  for  the  revised  simplex  method  becomes  nontrivial  in  such  a  large-scale  LP  problem,  it  is 
necessary  to  exploit  the  structure  of  the  problem  in  order  to  reduce  the  computational  burden.  The 
following  observations  are  used  to  develop  our  algorithm: 


5.  Note that the condition (A-6) and the non-negativity of $\mu_a^l$ are assured at the end of the sub-process.

6.  At  the  end  of  the  sub-process,  the  dual  feasibility  conditions  are  satisfied  for  all  generated  path  variables.  Thus,  if  the 
condition  is  violated  for  a  path  at  the  master  process,  then  this  path  has  not  yet  been  generated. 

7.  This  approach  has  also  been  shown  to  possess  numerical  stability  [54]. 
Figure B-1. Solution procedure for the SCFA-MF-LINE problem


At the end of any iteration, the following two statements are true if $q^\pi > 0$ for $\forall \pi \in \Pi$:

1.  For each commodity $\pi \in \Pi$, at least one commodity path flow variable, $x_p^\pi$ ($p \in P^\pi$), is in
the basis.

2.  For each arc $a \in A$, it is possible to maintain at least one restoration path flow variable,
$r_p^a$ ($p \in RP^a$), in the basis.

We randomly choose one such basic variable from each commodity and each arc, and call it a key
flow variable. At each iteration the basis matrix B can be arranged as follows:

$$B = \begin{pmatrix} I & 0 & 0 & 0 & C \\ R & I & 0 & 0 & D \\ 0 & H_1 & -I & K_1 & M_1 \\ 0 & H_2 & 0 & I & M_2 \\ 0 & H_3 & 0 & K_3 & M_3 \end{pmatrix}$$    (B-1)

where I is an identity sub-matrix. The first set of columns corresponds to the key commodity path
flow variables, and the second set is composed of the key restoration path flow variables. The
remaining commodity and restoration path variables are collected into the last set. The third and
fourth sets of columns contain the slack variables of the capacity constraints and the spare capacity
variables, respectively. The first set of rows represents the flow conservation law (Equation (A-1)),
and the second set represents the full restorability constraints (Equation (A-3)). The capacity
constraints (Equation (A-2)) are collected and arranged in the last three sets of rows so that the
identity submatrices with proper sign can appear in the place shown in Equation (B-1). All
submatrices R, $H_i$, $K_i$ and $M_i$ (i = 1, 2, 3) are sparse. The basis matrix arranged in the above
manner is called a standard form in this project.

Using the above matrix arrangement, the LU matrices can be readily obtained as follows:

$$B = \begin{pmatrix} I & & & & \\ R & I & & & \\ 0 & H_1 & I & & \\ 0 & H_2 & 0 & I & \\ 0 & H_3 & 0 & K_3 & L \end{pmatrix} \begin{pmatrix} I & 0 & 0 & 0 & C \\ & I & 0 & 0 & F \\ & & -I & K_1 & Y_1 \\ & & & I & Y_2 \\ & & & & U \end{pmatrix}$$

where $F = D - R \cdot C$, $Y_i = M_i - H_i \cdot F$ for i = 1, 2, 3, and $Z = L \cdot U = Y_3 - K_3 \cdot Y_2$. The major
computational task here turns out to be the factorization of the submatrix Z, since the matrix
multiplications to obtain F, $Y_i$ and Z can be easily performed by exploiting the problem structure. The
dimension of Z is identical to the number of non-key flow variables, which is significantly smaller
than that of the original matrix, B. According to our computational experiments, the dimension of
Z is only an eighth to a fourth of dim(B), resulting in a large reduction of memory space and
execution time. The algorithm runs about 30 times faster than the method which factorizes the
basis matrix at each iteration⁸.
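The relations $F = D - R \cdot C$, $Y_i = M_i - H_i \cdot F$ and $Z = Y_3 - K_3 \cdot Y_2$ can be sanity-checked numerically. The sketch below is illustrative only: it assumes a particular block layout of the standard form (rows [I 0 0 0 C], [R I 0 0 D], [0 H1 -I K1 M1], [0 H2 0 I M2], [0 H3 0 K3 M3]), which is an inference from those relations, and it shrinks every block to a single number so that the product of the two triangular factors can be compared with B directly. The function names are hypothetical.

```python
def matmul(a, b):
    """Plain dense matrix product for small lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def block_lu(c, r, dd, h, k1, k3, m):
    """Build B and its block LU factors with every block reduced to a scalar.

    h = (H1, H2, H3), m = (M1, M2, M3). The trailing 1x1 block Z is
    factorized trivially as L = 1, U = Z.
    """
    h1, h2, h3 = h
    m1, m2, m3 = m
    f = dd - r * c                       # F = D - R C
    y1, y2, y3 = m1 - h1 * f, m2 - h2 * f, m3 - h3 * f   # Y_i = M_i - H_i F
    z = y3 - k3 * y2                     # Z = Y_3 - K_3 Y_2
    b = [[1, 0, 0, 0, c],
         [r, 1, 0, 0, dd],
         [0, h1, -1, k1, m1],
         [0, h2, 0, 1, m2],
         [0, h3, 0, k3, m3]]
    lower = [[1, 0, 0, 0, 0],
             [r, 1, 0, 0, 0],
             [0, h1, 1, 0, 0],
             [0, h2, 0, 1, 0],
             [0, h3, 0, k3, 1]]
    upper = [[1, 0, 0, 0, c],
             [0, 1, 0, 0, f],
             [0, 0, -1, k1, y1],
             [0, 0, 0, 1, y2],
             [0, 0, 0, 0, z]]
    return b, lower, upper, z


B, L_factor, U_factor, Z = block_lu(2, 3, 5, (1, 2, 3), 4, 6, (7, 8, 9))
```

Only the trailing block Z needs a genuine factorization; every other block of the two factors is read off from B or obtained by sparse products.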

By  taking  advantage  of  the  similarity  between  two  successive  submatrices,  we  can  further  econ¬ 
omize  the  computation.  The  LU  submatrices  can  be  directly  updated  instead  of  factorized  from 
scratch  at  each  iteration.  Depending  on  the  type  of  the  entering  and  leaving  variables,  the  pivoting 
operation  can  be  categorized  into  four  classes: 

Class 1) The entering and leaving variables are both slack variables, and the capacity
constraint corresponding to the entering slack variable is in the fifth set of rows. In this case, a
row of $H_3$, $K_3$ and $M_3$ changes in the same position, resulting in a single row update in the
submatrix Z (Figure B-1-a).

Class 2) A commodity or restoration path flow variable enters and a non-key flow variable
leaves. In this case, the new submatrix Z differs from the previous one only in a single
column because a column update of C, D, $M_1$, $M_2$ and $M_3$ occurs in the same position (Figure
B-1-b).

Class 3) The entering variable is a flow variable and the leaving variable is a slack variable.
The dimension of the submatrix increases by one in this case. The update can be arranged so
that a new column is added to C, D, $M_1$ and $M_2$ in the last column. A new row is appended to
$H_3$ and $K_3$ in the last row, and a new row and a new column are added to the submatrix $M_3$ in
the corresponding position. This results in attaching a single row and column to the previous
submatrix Z (Figure B-1-c).

Class 4) In other cases, no explicit rule exists to realize a direct LU update.


8.  This speed-up is based on the experiments on the NJ-LATA sample network shown in Figure 3-4, “Sample networks
1. Real networks,” on page 41. Further speed-up is expected for a large network.

A  direct  LU  update  is  possible  for  the  first  three  classes,  and  further  improvement  on  the  com¬ 
putation  time  will  be  attained  if  these  three  types  of  pivoting  are  dominant.  Table  B-1  shows  the 
frequency  of  pivoting  for  each  class  based  on  our  experiments.  More  than  75  %  of  pivot  operations 
fall  into  the  above  three  categories,  and  we  could  thus  double  the  speed  of  the  computation.  The 
LU  factorization  routine  is  invoked  at  18  to  25  percent  of  the  iterations,  and  they  are  evenly  dis¬ 
tributed.  Thus,  the  refactorization  prevents  a  round-off  error  from  propagating  over  the  iterations 
and  leads  to  a  numerically  stable  operation. 

Bartels and Golub's decomposition method [12] [67] is applied in the first two classes. The
LU update procedure for Class 1 is summarized below. Class 2 can be handled in a similar fashion.

1.  First, the new submatrix $Z^{new}$ is factorized into $L \cdot U$ as shown in Figure B-2 (a), where
$L^{old}$ and $U^{old}$ are the submatrices L and U before the basis update. $U = U^{old}$, while L
differs from $L^{old}$ only by a single row as shown in the figure.

2.  A cyclic row permutation on the matrix L gives a lower Hessenberg matrix H as shown in
Figure B-2 (b). In a matrix form, $H = P \cdot L$ where P is a permutation matrix.

3.  Using elementary column operations, the matrix H can be transformed into a lower triangular
matrix $L^{new}$ as $L^{new} = H \cdot M$ where M is an elementary matrix.

4.  An upper triangular matrix $U^{new} = M^{-1} \cdot U$.

Now the LU decomposition of the permuted submatrix $P \cdot Z^{new}$ is obtained as $L^{new}$ and $U^{new}$. It
may be necessary to interchange adjacent columns for pivoting during the third step, and this can
be accomplished by a slight modification of the above procedure.

As for Class 3, the following procedure yields a new LU decomposition:

1.  First, the new submatrix $Z^{new}$ is factorized into $L \cdot U$ as shown in Figure B-3 (c).

2.  Using elementary row operations, U can be transformed into an upper triangular matrix
$U^{new}$. In a matrix form, $U^{new} = M \cdot U$ where M is an elementary matrix with a special
structure as shown in Figure B-3 (d). No row permutation is necessary in this case.

3.  A lower triangular matrix $L^{new} = L \cdot M^{-1}$. It has a closed form as shown in Figure B-3 (d).
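For the Class 3 situation, in which a single row and column are attached to Z, a textbook bordered update gives the new factors without refactorizing. The sketch below is one standard way to obtain such an update and is not claimed to be the exact sequence of operations in Figure B-3; the function names are hypothetical, L is assumed to be unit lower triangular, the diagonal of U is assumed non-zero, and no pivoting is performed.

```python
def forward_sub_unit(lower, rhs):
    """Solve lower * y = rhs for a unit lower triangular matrix."""
    y = []
    for i in range(len(rhs)):
        y.append(rhs[i] - sum(lower[i][k] * y[k] for k in range(i)))
    return y


def border_update(lower, upper, col, row, corner):
    """Extend Z = lower * upper to [[Z, col], [row, corner]].

    Returns (new_lower, new_upper) with the same triangular structure.
    """
    n = len(lower)
    u = forward_sub_unit(lower, col)               # lower * u = col
    l = []                                         # l^T * upper = row
    for j in range(n):
        s = row[j] - sum(l[k] * upper[k][j] for k in range(j))
        l.append(s / upper[j][j])
    pivot = corner - sum(l[k] * u[k] for k in range(n))
    new_lower = [lower[i] + [0.0] for i in range(n)] + [l + [1.0]]
    new_upper = [upper[i] + [u[i]] for i in range(n)] + [[0.0] * n + [pivot]]
    return new_lower, new_upper


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Z = [[2, 1], [4, 5]] bordered by column (1, 4), row (6, 9), corner 10.
L_old = [[1.0, 0.0], [2.0, 1.0]]
U_old = [[2.0, 1.0], [0.0, 3.0]]
L_new, U_new = border_update(L_old, U_old, [1.0, 4.0], [6.0, 9.0], 10.0)
```

The cost is two triangular solves of the old dimension, compared with a full factorization of the enlarged submatrix.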


Note that the worst-case complexity of the proposed algorithm is exponential since the simplex
algorithm is employed in the sub-process [17]. However, the running time is considerably reduced
by the proposed mechanisms to a level where the algorithm is applicable to any network of
practically reasonable size. Based on our experiment using the DEC Alpha-station 200 4/233⁹, the
total computing time is less than 10 minutes for the NJ-LATA network and about an hour for the
US-WAN network¹⁰.

Figure B-2. The structure of a new submatrix Z (panels: a) Class 1, b) Class 2)

Table B-1. The frequency of pivoting by category

  Category   Frequency of pivoting
  Class 1    7.3 - 36.3%   (29.2%)
  Class 2    17.5 - 46.0%  (26.8%)
  Class 3    17.3 - 29.2%  (20.7%)
  Class 4    17.5 - 25.3%  (23.3%)

Two sample networks, the NJ-LATA and US-WAN networks, are used in the
experiments (see Figure 3-4, “Sample networks 1. Real networks,” on page 41). Ten
randomly generated demands are employed for each network model.


Appendix  C.  SCFA-MF-LINE:  Validity  of  the  algorithm 


This section verifies the two claims which have been used in the development of our algorithm.
The first claim is that a basis matrix can be arranged in a standard form in all iterations of the
sub-process. The second claim is that a well-known shortest path algorithm can be applied to the dual
feasibility tests in the master process. Note that the column generation algorithm guarantees
termination at the optimum in a finite number of iterations if the generated columns
are not removed [50].

Figure B-3. Direct LU update operations


9.  This Alpha station runs at 233 MHz (21064A). It has SPECint 157.6 and SPECfp 183.8. A 512 KB cache and a 64 MB
memory are installed.

10.  See Section 3.4.2. for details on the settings of the experiments.


Lemma  C-1 

Assume $q^\pi > 0$ for $\forall \pi \in \Pi$, and suppose that a positive flow is assigned for any arc in the
algorithm. Then, a basis matrix B can be arranged as in Equation (B-1) in all iterative steps
of the sub-process.

(proof)

Since $q^\pi > 0$ for $\forall \pi \in \Pi$, at least one commodity path flow variable must always be in the
basis for each commodity. Similarly, based on the positivity assumption of an arc flow, at least
one restoration path flow must be in the basis for each arc. Therefore, it is possible to choose a
key flow column for each commodity and for each arc which constitutes the first and second
sets of columns. The last set of columns has no restriction, and any non-key flow variables can
be collected there in any order. Let $s_a^l$ denote the slack variable of the capacity constraint for
arc a against a failure of link l. The placement of all basic slack variables in the third set of
columns requires that the corresponding capacity constraints be in the third set of rows. Now,
place a basic spare capacity variable, $z_a$, in the fourth set of columns. This requires that some
capacity constraint for arc a be arranged in the corresponding row in the fourth set. Therefore,
a slack variable, $s_a^l$, must be non-basic for some $l \in E_a$. Assume otherwise. Suppose $\exists a \in A$
such that $z_a$ and $s_a^l$ for all $l \in E_a$ are in the basis. Let $B_a$ be a sub-matrix of B composed of
the columns $z_a$ and $s_a^l$ for $\forall l \in E_a$. Although $B_a$ is composed of $|E_a| + 1$ columns,
$\dim(B_a) = |E_a|$. Thus, the column vectors of $B_a$ are linearly dependent. This contradicts
the fact that the column vectors of $B_a$ are a part of the basis.

(Q.E.D.)

Note that the first assumption is not restrictive since a commodity with no demand can be
eliminated from the set $\Pi$ in the problem formulation. The second assumption requires that every arc has
a positive flow, which usually holds for any properly designed network. The assumption is
violated only if an arc is unnecessarily installed between nodes where no load is offered due to a lack
of demand. Furthermore, the following Lemma suggests that it is possible to maintain at least one
restoration flow variable in the basis even if an arc flow becomes zero at some point in the
sub-process. Consequently, the second assumption is not necessary.
Lemma C-2

Suppose that $r_p^a$ is the only restoration flow variable for arc a in the basis, and it is selected
as a leaving variable at an iteration in the sub-process. Furthermore, suppose that the
entering variable is not a restoration flow variable for arc a. Then there always exists a
candidate other than $r_p^a$ that can leave the basis.

(proof)

Let B be the current basis matrix before pivoting, m be the dimension of B (m = |B|), and n
be the number of all generated variables. Suppose that the q-th column vector (q > m) is
selected as the entering variable, and the leaving variable, $r_p^a$, is the r-th column vector ($r \le m$)
before pivoting. The following notations are used throughout the proof.

$a_q = (a_{iq})$ (i = 1, ..., m) : The column vector corresponding to the entering variable.

$b = (b_i)$ (i = 1, ..., m) : The right-hand-side column vector.

$a_q' = (a_{iq}')$ (i = 1, ..., m) : $a_q' = B^{-1} \cdot a_q$ where B is the basis matrix before pivoting.

$b' = (b_i')$ (i = 1, ..., m) : Solution vector before pivoting. Namely, $b' = B^{-1} \cdot b$.

$y = (y_i)$ (i = 1, ..., n) : Solution vector after pivoting, using $r_p^a$ as the leaving variable. The value is given by:

$$y_i = \begin{cases} b_r' / a_{rq}' & (i = q) \\ b_i' - a_{iq}' \cdot y_q & (i \le m,\ i \ne r) \\ 0 & (i > m,\ i \ne q, \text{ or } i = r) \end{cases}$$

$X^a = \{x_p^\pi : a \in p,\ x_p^\pi \text{ is in the basis before pivoting for } \forall \pi \in \Pi\}$ : Namely, $X^a$ is a set of all basic
commodity path flow variables that go through arc a before pivoting.
Case 1) When $X^a \ne \emptyset$

All basic variables in $X^a$ become zero after pivoting since the only basic restoration path
variable of arc a leaves the basis.

Case 1-a) When $b_r' > 0$.

Since $b_r' > 0$, there exists at least one commodity flow variable in $X^a$ such that it takes a
positive value before pivoting. Suppose that $x_p^\pi$ is such a variable, and it is in the k-th column
of the basis ($b_k' > 0$). $a_{rq}' > 0$ since the r-th variable is selected as a leaving variable. Thus,
we have $y_q = b_r' / a_{rq}' > 0$. Since all basic variables in $X^a$ become zero after pivoting,

$$y_k = b_k' - a_{kq}' \cdot y_q = 0$$

This together with the positivity of $b_k'$ and $y_q$ gives

$$a_{kq}' > 0$$    (C-1)

Furthermore, we can derive

$$y_q = b_r' / a_{rq}' = b_k' / a_{kq}'$$    (C-2)

From (C-1) and (C-2), we can conclude that $x_p^\pi$ is an alternative leaving variable.

Case 1-b) When $b_r' = 0$.

Since $b_r' = 0$, all variables in $X^a$ must be zero even before pivoting. Let $I^a$ be a set of the
column indices for all variables in $X^a$. Assume that there is no other choice for a leaving
variable. This implies $a_{kq}' \le 0$ for $\forall k \in I^a$. Otherwise, a variable violating this condition can
leave the basis since

$$y_q = b_r' / a_{rq}' = b_k' / a_{kq}' = 0 \quad \text{for } k \in I^a$$

Now consider the following linear system which is composed of all basic variables and the
entering variable (the q-th column).

$$\begin{aligned} x_1 + a_{1q}' x_q &= b_1' \\ x_2 + a_{2q}' x_q &= b_2' \\ &\ \vdots \\ x_m + a_{mq}' x_q &= b_m' \end{aligned}$$    (C-3)

The full restorability constraint for arc a can be rewritten in terms of $x_i$ as follows:

$$x_r - \sum_{k \in I^a} x_k - \gamma_a \cdot x_q = 0$$    (C-4)

where $\gamma_a = 1$ if the entering column is the commodity flow variable and its path contains arc a,
and $\gamma_a = 0$ otherwise.

Then, any solution of the linear system (C-3) must satisfy (C-4), since the former is the system
transformed from the set of the constraints of the SCFA-MF-LINE problem, including
Equation (C-4). Now let $x_q = \delta > 0$. Since $a_{rq}' > 0$ and $a_{kq}' \le 0$ for $\forall k \in I^a$,

$$x_r = b_r' - a_{rq}' \delta = -a_{rq}' \delta < 0, \quad \text{and} \quad x_k = b_k' - a_{kq}' \delta = -a_{kq}' \delta \ge 0 \ \text{ for } \forall k \in I^a$$

However, this violates Equation (C-4) and leads to a contradiction. Therefore, there is at least
one variable in $X^a$ with column index k ($k \in I^a$) such that $a_{kq}' > 0$, and we can select this
variable as a leaving variable.


Case 2) When $X^a = \emptyset$

The only restoration flow variable of arc a in the basis becomes non-basic. Thus, the
entering variable must be a commodity flow variable whose path includes arc a. Otherwise, the row
vector of the basis matrix corresponding to the full restorability constraint of arc a becomes
zero. Thus, the resulting column vectors of the basis matrix do not form the basis. Since all
restoration flow variables for arc a are out of the basis after pivoting, $y_q = 0$. Thus,
$b_r' = y_q \cdot a_{rq}' = 0$. Substitute $x_q = \delta > 0$ in the system (C-3) as before. This yields
$x_r = b_r' - a_{rq}' x_q = -a_{rq}' \delta < 0$. However, this violates Equation (C-4). Therefore, the only
basic restoration flow variable of arc a never leaves the basis when $X^a = \emptyset$.

(Q.E.D.)




An  initial  basis  includes  one  restoration  path  variable  per  arc.  Thus,  the  situation  described  in 
Lemma  C-2  is  the  only  time  when  all  the  restoration  path  variables  of  a  certain  arc  become  non- 
basic.  The  Lemma  assures  that  the  selected  leaving  variable  can  be  replaced  in  such  a  case.  Note 
that  this  situation  seldom  happens  in  practice,  as  discussed  earlier,  and  has  never  been  encountered 
in  our  experiments.  The  following  concluding  theorem  on  a  basis  matrix  arrangement  is  a  direct 
consequence  of  Lemma  C-1  and  Lemma  C-2. 

Theorem  C-1 

Assume $q^\pi > 0$ for $\forall \pi \in \Pi$. Then, at each iteration of the sub-process, it is possible to
maintain a standard form of the basis matrix B by selectively choosing a leaving variable.
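The phrase "selectively choosing a leaving variable" can be made concrete with a ratio test that collects every tied minimum-ratio candidate and skips a protected one when an alternative exists. This is a minimal sketch, assuming the usual minimum-ratio rule with $b'$ and $a_q'$ as defined in the proof of Lemma C-2; the function names, the tolerance, and the `protected` index set (standing for the last basic restoration variable of some arc) are hypothetical.

```python
def leaving_candidates(b_prime, a_prime, tol=1e-12):
    """Indices attaining the minimum ratio b'_i / a'_iq over a'_iq > 0."""
    ratios = [(b_prime[i] / a_prime[i], i)
              for i in range(len(b_prime)) if a_prime[i] > tol]
    if not ratios:
        return []                      # no bound on the entering variable
    best = min(value for value, _ in ratios)
    return [i for value, i in ratios if value <= best + tol]


def choose_leaving(b_prime, a_prime, protected):
    """Prefer a tied candidate that is not in the protected index set."""
    candidates = leaving_candidates(b_prime, a_prime)
    for i in candidates:
        if i not in protected:
            return i
    return candidates[0] if candidates else None


# Degenerate example: rows 0 and 2 tie at ratio 0; row 0 is protected.
tied = leaving_candidates([0.0, 2.0, 0.0], [1.0, 1.0, 2.0])
chosen = choose_leaving([0.0, 2.0, 0.0], [1.0, 1.0, 2.0], protected={0})
```

Lemma C-2 is what guarantees that, in the situation it describes, the candidate list contains an index other than the protected one.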

The  following  Lemma  ensures  that  a  well-known  shortest  path  algorithm  can  be  applied  in  the 
dual  feasibility  tests  related  to  the  commodity  flow  variables  (Lemma  C-3-i)  as  well  as  the  restora¬ 
tion  flow  variables  (Lemma  C-3-ii). 

Lemma  C-3 

At  the  end  of  the  sub-process,  the  following  two  statements  are  true: 

i)  No negative arc exists in the network G with arc cost $\{d_a + w_a\}$.

ii)  For $\forall l \in E$, no negative arc exists in the network $G \setminus \{l\}$ with arc cost $\{\mu_a^l\}$.


(proof)

Obviously $\mu_a^l \ge 0$ ($\forall a \in A$, $\forall l \in E_a$) at the end of the sub-process because all dual
feasibility conditions are assured except for those related to non-generated flow variables. Let
$(\sigma, w, \mu)$ denote a row vector of simplex multipliers arranged in the order of corresponding
constraints in the standard form of the current basis matrix, B. Then, the equality
$(d_1, 0, d_3) = (\sigma, w, \mu) \cdot B$ holds, where $d_1$ and $d_3$ are the cost vectors of the corresponding
basic variables. The second equation, $0 = w + \mu H$, is equivalent to $w = -\mu H$
($H = (H_1, H_2, H_3)^t$), and this leads to $w_a = \sum_{b \in p} \mu_b^l \ge 0$ for $\forall a \in A$, where p is the
key restoration path for arc a. Thus, $d_a + w_a \ge 0$ for $\forall a \in A$.
Appendix D. Extension to SCFA-KSP-LINE


A  heuristic  algorithm  is  developed  to  find  a  fully  restorable  capacity  assignment  for  a  KSP- 
based  system.  It  uses  the  optimal  solution  of  the  SCFA-MF-LINE  problem  as  a  starting  point.  The 
KSP  restoration  usually  requires  slightly  more  spare  bandwidth  than  the  MF  restoration.  The  pro¬ 
posed  heuristic  detects  the  amount  of  unrecoverable  flow  and  gradually  adds  capacity  over  the 
shortest  restoration  path  by  iteratively  invoking  the  following  loop  until  full  restorability  is 

attained: 

1.  Let $u^\alpha$ and $u^\beta$ be the amount of unrecoverable flow of arcs $\alpha$ and $\beta$ against a failure of link
l ($l = \{\alpha, \beta\}$). Calculate $\{\epsilon \cdot z_a^l\}$ for $\epsilon \in (0, 1]$ where $z_a^l$ is given by:

$$z_a^l = \delta_{a,p^\alpha} \cdot u^\alpha + \delta_{a,p^\beta} \cdot u^\beta$$

$p^\alpha$ and $p^\beta$ are the shortest restoration paths for arcs $\alpha$ and $\beta$, respectively. $z_a^l$ gives the
necessary additional capacity of arc a in order to restore the unrecoverable flow due to a failure
of link l over the shortest restoration route.

2.  For $\forall a \in A$, add capacity by $\Delta c_a$.

3.  Invoke  the  round-up  procedure. 
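Step 1 of the loop can be sketched for a single link failure as follows. This is an illustration under assumptions rather than the report's routine: the function name and data layout are hypothetical, the unrecoverable flows are taken as given inputs (computing them requires simulating the KSP restoration, which is not shown), and how the per-failure increments are combined into the capacity addition of step 2 is left open because the text does not spell it out.

```python
def capacity_increment(unrecovered, shortest_restoration_path, eps):
    """Return {arc: eps * z_a^l} for one failed link l.

    unrecovered: {failed_arc: unrecoverable flow} for the arcs of link l.
    shortest_restoration_path: {failed_arc: list of arcs on its shortest
    restoration route}. eps is the incremental rate in (0, 1].
    """
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    increment = {}
    for failed_arc, flow in unrecovered.items():
        for a in shortest_restoration_path[failed_arc]:
            increment[a] = increment.get(a, 0.0) + eps * flow
    return increment


# Hypothetical failure of a link with arcs "alpha" and "beta".
step = capacity_increment(
    {"alpha": 4.0, "beta": 2.0},
    {"alpha": ["a1", "a2"], "beta": ["a2"]},
    eps=0.5,
)
```

With a rate of 0.5, half of the missing capacity is added in this pass, which leaves room for the spare capacity added for other failure scenarios to cover the rest.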

A fully restorable capacity assignment is obtained in one iteration if the incremental rate, $\epsilon$, is set to
one. However, the additional spare capacity over the shortest restoration routes for certain arcs
could increase the spare capacity of the k-th ($k \ne 1$) shortest route of other arcs. On such
occasions, the spare capacity can be shared among different failure scenarios, and the required amount of
additional spare capacity would decrease. If $\epsilon$ is set to one, however, the heuristic cannot attain the
cost reduction by spare capacity sharing. Thus, a smaller value of $\epsilon$ would yield a less expensive
assignment with a slightly higher number of iterations. Our preliminary experiments have shown
that $\epsilon$ = 0.1 is small enough to acquire the least expensive capacity allocation with this heuristic
algorithm. This post-processing routine typically terminates in tens of seconds with a few iterations.

The KSP rerouting performs restoration path hunting through message flooding. In order to
complete the restoration in a short period, it does not usually seek lengthy alternate routes, while
such routes could be selected in the MF rerouting algorithm. In addition, a hop-limit is typically
imposed on a restoration route. Therefore, the solution of the SCFA-MF-LINE problem might not
be appropriate as a starting point of the proposed post-processing routine. In order to conform to
more realistic situations, the algorithm is slightly modified as follows: First, all candidate
restoration routes are enumerated, and their columns are generated at the beginning of the procedure. In
the master process, all procedures related to restoration flow variables are skipped, and only the
dual feasibility conditions corresponding to commodity flow variables (namely, Equation (A-4))
are examined. If necessary, a commodity flow variable is generated in the master process, but not a
restoration flow variable. The post-processing routine uses the solution of this revised algorithm as
its initial solution. Due to the restriction on the restoration path variables, this modification
significantly improves the computation time. Our experiment indicates that the
required CPU time is reduced by an order of magnitude for a US-WAN network. It is also
observed that the cost increase is only 2 to 10 percent over MF-based networks with restoration
path restriction.


Appendix  E.  SCFA-MF-ETE:  Problem  formulation 


The SCFA-MF-ETE searches for an optimal capacity and flow assignment for the self-healing
ATM networks based on end-to-end restoration. In the problem formulation, the full restorability
constraints must be revised due to the different restoration strategies, while the others remain
almost identical to those of the SCFA-MF-LINE problem. In end-to-end restoration, each
commodity selects its rerouting paths independently between the respective source and
destination nodes. Thus, one full restorability constraint is required for every combination of
a commodity $\pi$ and a failed link $l$, which is referred to as a $(\pi, l)$ constraint or a
$(\pi, l)$ row in this project. The following additional notations are employed in the
formulation of the SCFA-MF-ETE problem¹¹:

$RP^{\pi,l}$ : A set of all possible restoration routes for commodity $\pi \in \Pi$ against a
failure of link $l \in E$. Its element is called a $(\pi, l)$ restoration path.

$r = (r_p^{\pi,l})$ : A restoration path flow vector, $\pi \in \Pi$, $l \in E$, $p \in RP^{\pi,l}$.

$\delta_{a,l,p}$ : An indicator variable which equals 1 if both arc $a$ and link $l$ are
contained in path $p$, and 0 otherwise.

$r_a^l$, $\forall a \in A$, $\forall l \in E$ : The amount of bandwidth of arc $a$ which is
released by affected VP's due to a failure of link $l$.

$w^{\pi,l}$, $\forall \pi \in \Pi$, $\forall l \in E$ : The price of link restoration per
commodity (dual variable).

11. Refer to Appendix T for a complete listing of notations.

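Now the SCFA-MF-ETE problem and its dual problem (SCFA-MF-ETE-DUAL) are formulated
as follows:

(SCFA-MF-ETE)

\[
\begin{aligned}
\text{Minimize} \quad & D(x, r, z) = \sum_{a \in A} d_a \cdot (f_a + z_a) \\
\text{over} \quad & x \ge 0,\ r \ge 0,\ z \ge 0 \\
\text{subject to} \quad
& \text{a)}\ \sum_{p \in P^{\pi}} x_p^{\pi} = q^{\pi}
  && \forall \pi \in \Pi && \text{(E-1)} \\
& \text{b)}\ z_a + r_a^l - z_a^l \ge 0
  && \forall a \in A,\ \forall l \in E && \text{(E-2)} \\
& \text{c)}\ \sum_{p \in RP^{\pi,l}} r_p^{\pi,l} - \sum_{p \in P^{\pi}:\ p \ni l} x_p^{\pi} = 0
  && \forall \pi \in \Pi,\ \forall l \in E && \text{(E-3)}
\end{aligned}
\]

where

\[
f_a = \sum_{\pi \in \Pi} \sum_{p \in P^{\pi}} \delta_{a,p} \cdot x_p^{\pi}, \qquad
z_a^l = \sum_{\pi \in \Pi} \sum_{p \in RP^{\pi,l}} \delta_{a,p} \cdot r_p^{\pi,l}, \qquad
r_a^l = \sum_{\pi \in \Pi} \sum_{p \in P^{\pi}} \delta_{a,l,p} \cdot x_p^{\pi}.
\]

(SCFA-MF-ETE-DUAL)

\[
\begin{aligned}
\text{Maximize} \quad & \sum_{\pi \in \Pi} q^{\pi} \cdot \sigma^{\pi} \\
\text{over} \quad & \sigma^{\pi},\ w^{\pi,l} \text{ unrestricted},\ \mu_a^l \ge 0 \\
\text{subject to} \quad
& \text{i)}\ \sum_{a \in p} \left( d_a + w^{\pi,l}\big|_{l \ni a} \right)
  - \sum_{l \in p} \sum_{a \in p} \mu_a^l \ge \sigma^{\pi}
  && \forall p \in P^{\pi},\ \forall \pi \in \Pi && \text{(E-4)} \\
& \text{ii)}\ \sum_{a \in p} \mu_a^l \ge w^{\pi,l}
  && \forall p \in RP^{\pi,l},\ \forall \pi \in \Pi,\ \forall l \in E && \text{(E-5)} \\
& \text{iii)}\ \sum_{l \in E} \mu_a^l \le d_a
  && \forall a \in A && \text{(E-6)}
\end{aligned}
\]
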

The constraints (E-1), (E-2) and (E-3) correspond to the flow conservation law, the capacity
constraints, and the full restorability constraints, respectively. Note that the end-to-end
restoration scheme releases the bandwidth of the affected VP's over their original routes. The
released capacity can be reclaimed for the restoration of any affected traffic. This phenomenon
is embodied in the second term of the capacity constraints, which expresses the amount of the
reusable bandwidth. $\sigma^{\pi}$, $\mu_a^l$ and $w^{\pi,l}$ are the simplex multipliers
corresponding to the constraints (E-1), (E-2) and (E-3), respectively.
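
The effect of the released bandwidth on the capacity constraints can be illustrated with a small
sketch. It assumes that the capacity constraint (E-2) reads $z_a + r_a^l - z_a^l \ge 0$, so that
the spare capacity required on arc $a$ for one failure scenario is $\max(0, z_a^l - r_a^l)$. The
function names, the representation of a link as a `frozenset` of its end nodes, and the example
flows are assumptions made for illustration only.

```python
from collections import defaultdict


def arcs_of(path):
    """Directed arcs traversed by a node sequence."""
    return list(zip(path, path[1:]))


def spare_capacity_needed(working, restoration, failed_link):
    """Per-arc lower bound on spare capacity for one failure scenario.

    working:     list of (path, flow) pairs for commodity paths
    restoration: list of (path, flow) pairs used when failed_link fails
    failed_link: frozenset of the two end nodes of the failed link
    Returns {arc: max(0, z_a^l - r_a^l)} for arcs carrying restoration flow.
    """
    released = defaultdict(float)  # r_a^l: bandwidth freed on arc a
    rerouted = defaultdict(float)  # z_a^l: restoration flow on arc a
    for path, flow in working:
        arcs = arcs_of(path)
        if any(frozenset(a) == failed_link for a in arcs):
            for a in arcs:  # the whole original route is released
                released[a] += flow
    for path, flow in restoration:
        for a in arcs_of(path):
            rerouted[a] += flow
    return {a: max(0.0, rerouted[a] - released[a]) for a in rerouted}


# Hypothetical example: one commodity A -> C routed over A-B-C, link B-C fails.
working = [(["A", "B", "C"], 5.0)]
restoration = [(["A", "B", "D", "C"], 3.0), (["A", "D", "C"], 2.0)]
need = spare_capacity_needed(working, restoration, frozenset(("B", "C")))
```

In this example the restoration flow on arc (A, B) reuses bandwidth released by the affected
path, so no spare capacity is needed there, whereas arcs (B, D), (D, C) and (A, D) must provide
spare capacity for the full restoration flow they carry.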


Appendix F. SCFA-MF-ETE: Solution approach

An approach similar to that for the SCFA-MF-LINE problem can be applied to the SCFA-MF-ETE
problem. The arc-chain flow representation is employed to simplify the constraint structure,
and the column generation approach is applied to handle the enormous number of path variables
due to the arc-chain flow representation. The solution procedure is decomposed into a master
process and a sub-process. The computational load of the simplex procedure in the sub-process
is considerably lowered by exploiting the special structure of the problem. The following two
statements are true at the end of any iteration if $q^{\pi} > 0$ for $\forall \pi \in \Pi$.

1. For each commodity $\pi \in \Pi$, at least one commodity path flow variable, $x_p^{\pi}$
($p \in P^{\pi}$), is in the basis.

2. For each combination of $\pi \in \Pi$ and $l \in E$, it is possible to keep at least one
restoration path flow variable, $r_p^{\pi,l}$ ($p \in RP^{\pi,l}$), in the basis.

Then the basis matrix, $B$, can be arranged as shown below, and it can be readily factorized
into an LU form:

[Equation (F-1): block arrangement of the basis matrix $B$ and its LU factorization]

where $F = D - RC$, $Y_i = M_i - H_i F - A_i C$ for $i = 1, 2, 3$, and $Z = L \cdot U$.


The arrangement of the rows and columns is consistent with that in Equation (B-1). The structure
is almost identical except for the introduction of $A_i$ ($i = 1, 2, 3$), which comes from the
second term of Equation (E-2). All sub-matrices, $R$, $A_i$, and $M_i$ ($i = 1, 2, 3$), are
sparse. We again call the basis matrix arranged in this manner a standard form. A direct update
of the LU submatrices can be carried out in the same fashion as discussed in the previous
section.


Additional mechanisms must be devised to tackle the new issues inherent to the SCFA-MF-ETE
problem. First of all, the number of full restorability constraints grows enormously, from
$|A|$ ($O(N)$) in the SCFA-MF-LINE problem to $|E| \cdot |\Pi|$ ($O(N^3)$) in the SCFA-MF-ETE
problem (i.e., from 46 to 2,530 for an NJ-LATA network, and from 90 to 34,020 for a US-WAN
network). Although the size of the submatrix $Z$ and the computation of the LU factorization
remain the same, the revised simplex procedure significantly slows down in the computation of
an entering column vector and simplex multipliers at each iteration. The end-to-end restoration
scheme requires that its full restorability constraints be expressed for all combinations of
commodities and link failure scenarios. However, the $(\pi, l)$ constraint is necessary only if
at least one generated commodity flow for $\pi$, $x_p^{\pi}$, goes through the link $l$. The
$(\pi, l)$ constraint is defined as active if it satisfies the above condition and is called
inactive otherwise. The inactive $(\pi, l)$ constraints do not have any effect on the
sub-process since the value of their related basic variables, $r_p^{\pi,l}$, stays zero in the
entire operation of the sub-process. Since each commodity is expected to be transferred over
only a fraction of a network, the number of the constraints can be largely reduced by generating
only active $(\pi, l)$ constraints.
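
The notion of an active constraint can be made concrete with a short sketch. The data layout
(a dictionary from commodity identifiers to lists of generated node sequences) and the function
name are assumptions for illustration; the rule itself follows the definition above, namely that
a $(\pi, l)$ pair is active when at least one generated path of commodity $\pi$ traverses
link $l$.

```python
def active_constraints(generated_paths):
    """Return the set of active (commodity, link) pairs.

    generated_paths maps a commodity id to the node sequences generated
    so far for it. Links are represented as frozensets of end nodes.
    """
    active = set()
    for commodity, paths in generated_paths.items():
        for path in paths:
            for u, v in zip(path, path[1:]):
                active.add((commodity, frozenset((u, v))))
    return active


# Hypothetical example with four links (A-B, B-C, A-D, B-D) and two commodities.
generated = {
    "pi1": [["A", "B", "C"]],
    "pi2": [["A", "D"], ["A", "B", "D"]],
}
active = active_constraints(generated)
```

Here only 5 of the 8 possible (commodity, link) rows would be generated; the remaining rows stay
inactive until a newly generated commodity path touches their link.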


Figure F-1. Solution procedure for the SCFA-MF-ETE problem

The master process generates a $(\pi, l)$ row when it becomes active due to newly generated
commodity flow variables of $\pi$. In addition, it deletes non-basic commodity and restoration
flow variables which have not been referred to for some period. This decreases the number of
non-basic variables involved in the pricing-out operation [54] in the sub-process. More
importantly, the dimension of the basis matrix can be reduced. A $(\pi, l)$ row can be removed
if it turns inactive as a result of the deletion of a non-basic commodity flow variable. Note
that this column deletion procedure can be performed only when the value of the objective
function has improved in the previous sub-process. This condition is required to guarantee the
termination of the algorithm in a finite number of iterations. Figure F-1 summarizes the
proposed solution procedure. The initialization process generates only the shortest commodity
path variables, active full restorability constraints, and the shortest restoration path
variables for the active constraints.
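
The column deletion rule can be sketched as below. The text only states that variables are
deleted when they "have not been referred to for some period", so the idle threshold `max_idle`,
the bookkeeping dictionary `last_used` and the function name are assumptions introduced for this
example.

```python
def prune_columns(last_used, basic, iteration, max_idle, improved):
    """Return ids of generated columns that may be deleted.

    last_used: {column_id: iteration at which the column was last referenced}
    basic:     set of column ids currently in the basis (never deleted)
    improved:  True when the previous sub-process improved the objective
    """
    if not improved:
        # Deletion is skipped to preserve finite termination.
        return set()
    return {
        col
        for col, used in last_used.items()
        if col not in basic and iteration - used > max_idle
    }


# Hypothetical bookkeeping after ten master iterations.
last_used = {"x1": 9, "x2": 2, "r1": 1}
stale = prune_columns(last_used, basic={"r1"}, iteration=10, max_idle=5, improved=True)
```

Column `r1` is kept although it is old because it is basic, and nothing is deleted when the
objective value did not improve.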

The major role of the master process is to test the dual feasibility conditions and to determine
whether the solution obtained in the sub-process is globally optimal or not. Since the dual
feasibility condition (E-6) and non-negativity of $\mu_a^l$ are already established at the end
of the sub-process, only the conditions (E-4) and (E-5) must be checked. As in the SCFA-MF-LINE
problem, it is sufficient to examine a set of shortest paths to verify the global optimality.
However, a traditional shortest path algorithm cannot be applied to the test on (E-4). This
condition is assured if the length of the shortest path of commodity $\pi$ with the modified arc
cost $(d_a + w^{\pi,l})$ and the mutual arc cost $(-\mu_a^l)$ is not less than $\sigma^{\pi}$.
The mutual arc cost is imposed if a path goes through both arc $a$ and link $l$. A quadratic
shortest path (QSP) algorithm, which is described in Appendix Q, is developed to obtain the
shortest path of this type. As to the condition (E-5), a traditional shortest path algorithm can
be adopted. This condition is satisfied if the length of the shortest path $p \in RP^{\pi,l}$
under modified arc length $\mu_a^l$ in the network $G \setminus \{l\}$ is not less than
$w^{\pi,l}$. If the dual feasibility conditions are violated, then the master process generates
all violated shortest paths and invokes the sub-process. Otherwise, the current solution is
globally optimal.

In the above dual feasibility test, all simplex multipliers must be at hand, including the ones
corresponding to non-generated full restorability constraints. The multipliers for the
generated rows can be obtained by $d \cdot B^{-1}$, where the vector $d$ contains the costs
corresponding to the current basis. As for a non-generated $(\pi, l)$ constraint, its
corresponding simplex multiplier, $w^{\pi,l}$, can be obtained by $\sum_{a \in p} \mu_a^l$,
where $p$ is the shortest path for commodity $\pi$ in the network $G \setminus \{l\}$ with the
modified arc cost $\mu_a^l$. It can also be shown that the row generation and deletion
procedures in the master process will not change the value $Z = L \cdot U$ in Equation (F-1).
Therefore, it is not necessary to re-factorize $Z$ at the end of the master process, and the LU
factorization for a new set of constraints can be trivially obtained.
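
The shortest path computation used for condition (E-5), and for the multiplier of a
non-generated $(\pi, l)$ row, can be sketched with Dijkstra's algorithm, which is applicable
because the arc lengths $\mu_a^l$ are non-negative. The dictionary-based arc representation and
the function names are assumptions for illustration, and the numerical values are hypothetical.

```python
import heapq


def shortest_length_without_link(arc_cost, source, target, failed_link):
    """Dijkstra over directed arcs, skipping both arcs of failed_link.

    arc_cost maps (u, v) to a non-negative length (here: mu_a^l).
    Returns float('inf') when target is unreachable.
    """
    adj = {}
    for (u, v), cost in arc_cost.items():
        if frozenset((u, v)) == failed_link:
            continue
        adj.setdefault(u, []).append((v, cost))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def violates_e5(arc_cost, source, target, failed_link, w):
    """True when some restoration path is shorter than w, i.e. (E-5) fails."""
    return shortest_length_without_link(arc_cost, source, target, failed_link) < w


# Hypothetical multipliers mu_a^l for a failure of link B-C, commodity A -> C.
mu = {("A", "B"): 0.0, ("B", "C"): 0.0, ("A", "D"): 0.5, ("D", "C"): 0.25}
link = frozenset(("B", "C"))
length = shortest_length_without_link(mu, "A", "C", link)
```

For a non-generated row, choosing $w^{\pi,l}$ equal to `length` makes the test pass by
construction; for a generated row, a value of $w^{\pi,l}$ larger than `length` signals that a
new restoration column should be generated.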

The execution time of the proposed algorithm is largely reduced by the above mechanisms,
although more time is required than in the case of the SCFA-MF-LINE problem. In our
experiment¹², the execution time ranges from an hour to a day depending on network size. Since
the algorithm is meant for network planning, the execution time is not a critical issue unless
it takes unreasonably long, say a month or more.


12. Refer to Section 3.4.2 for details on the settings of the experiments.


Appendix G. SCFA-MF-ETE: Validity of the algorithm


This  section  shows  the  validity  of  the  claims  which  have  been  employed  in  the  development  of 
our  algorithm.  First  of  all,  we  show  that  a  basis  matrix  can  be  arranged  in  a  standard  form  at  any 
iteration  of  the  sub-process. 

Lemma  G-1 

Consider an active $(\pi, l)$ constraint. Suppose that $r_p^{\pi,l}$ is the only $(\pi, l)$
restoration path variable in the basis and is selected as a leaving variable at some point in
the sub-process. Furthermore, suppose that the entering variable is not a $(\pi, l)$ restoration
path variable. Then, we can always find a candidate for a leaving variable other than
$r_p^{\pi,l}$.

(proof)

Let $X^{\pi,l} = \{ x_p^{\pi} \mid l \in p,\ x_p^{\pi} \text{ is in the basis before pivoting} \}$.
Then, the assertion can be proved in the same way as in the proof of Lemma C-2 with the
corresponding set replaced by $X^{\pi,l}$. Namely, it can be shown that some commodity flow
variable in $X^{\pi,l}$ can leave the basis when $X^{\pi,l}$ is non-empty, and that
$r_p^{\pi,l}$ is not selected as a leaving variable when $X^{\pi,l}$ is empty.

(Q.E.D.) 

An initial basis includes one restoration path variable per active restoration path. When a
$(\pi, l)$ row is generated in the master process, a corresponding restoration path variable is
also generated. Thus, the situation described in Lemma G-1 is the only occasion when all
$(\pi, l)$ restoration path variables could be removed from the basis. As in the case of the
SCFA-MF-LINE problem, the above Lemma assures that replacing a leaving variable is always
possible on such an occasion. Unlike the previous case, however, this situation could happen in
practice¹³. Therefore, the selection mechanism of a leaving variable is mandatory for the
SCFA-MF-ETE solution module.

13. This situation could happen if more than one commodity flow variable of $\pi$ is generated
and if the demand of $\pi$ is carried over only a part of the generated paths. If link $l$ is
contained only in the generated commodity paths with no flow, then the only $(\pi, l)$
restoration path variable in the basis can be selected as a leaving variable. Even though it is
a relatively rare event, the situation has occasionally appeared in our experiments.

Theorem G-2

Assume $q^{\pi} > 0$ for $\forall \pi \in \Pi$. Then, at the end of an iteration, a new basis
matrix can be arranged in the standard form given in Equation (F-1).

(proof)

Since $q^{\pi} > 0$, there is at least one basic commodity flow variable for each commodity.
Thus it is possible to choose a key column per commodity and to arrange the first set of
columns. Lemma G-1 ensures that at least one key restoration flow column can be kept in the
basis. Thus, it is possible to arrange the second set of columns. The third and fifth sets of
columns can be trivially arranged, while the arrangement of the fourth set of columns can be
proved by the same approach as in Lemma C-1.

(Q.E.D.)

The following three Lemmas deal with the validity of the procedures employed in the master
process. Lemma G-3 proves that a shortest path algorithm and the proposed QSP algorithm can be
applied in the dual feasibility test. Lemma G-4 assures the invariability of the LU sub-matrices
after the row generation and deletion procedure. Finally, Lemma G-5 tells how to determine the
simplex multipliers corresponding to inactive $(\pi, l)$ constraints.


Lemma  G-3 

At the end of the sub-process, the following two inequalities hold¹⁴.
d  y  u^>0forVaeA 

> 0  for  Va e  A,'^le 


(proof)

Obviously, the dual feasibility conditions, $\mu_a^l \ge 0$ and
$d_a - \sum_{l \in E} \mu_a^l \ge 0$ (Equation (E-6)), hold for $\forall a \in A$ and
$\forall l \in E$ at the end of the sub-process. Thus, it suffices to show $w^{\pi,l} \ge 0$ to
prove the Lemma. The same argument as in Lemma C-2 can be applied to show
$w^{\pi,l} = \sum_{a \in p} \mu_a^l \ge 0$ for all active $(\pi, l)$ rows, where $p$ is the key
restoration path for commodity $\pi$ against a failure of link $l$. For non-generated
$(\pi, l)$ rows, $w^{\pi,l}$ is selected as $\sum_{a \in p} \mu_a^l \ge 0$, where $p$ is the
shortest $(\pi, l)$ restoration path with modified arc cost $\{\mu_a^l\}$.

(Q.E.D.)

14. The first condition is necessary for our proposed QSP algorithm to function correctly. See
Appendix Q for details.


Lemma G-4


The row generation and deletion procedures in the master process do not change the value of
$Z = L \cdot U$ in Equation (F-1).

(proof)

Suppose that the $(\pi, l)$ row and its related key restoration flow variable, $r_p^{\pi,l}$,
are generated in the master process. $r_p^{\pi,l}$ is the only basic column vector with a
non-zero entry in the $(\pi, l)$ row after the row generation. Therefore, no column vector of
$F^{new} = D^{new} - R^{new} C$ has a non-zero entry at the $(\pi, l)$ row, where the
superscript, new, in a sub-matrix indicates the sub-matrix after the row generation. Thus,

\[ H_i^{new} F^{new} = H_i^{old} F^{old} \]

where the superscript, old, in a sub-matrix indicates the sub-matrix before the row generation.
Since there is no change in the other sub-matrices, $Y_i^{new} = Y_i^{old}$, and hence
$Z^{new} = Z^{old}$.

Now suppose that the $(\pi, l)$ row and its related key restoration flow variable,
$r_p^{\pi,l}$, are removed in the master process. Again, $r_p^{\pi,l}$ is the only basic column
vector with a non-zero entry in the $(\pi, l)$ row before the row deletion, and the same
argument as above can be applied to derive $Z^{new} = Z^{old}$.

(Q.E.D.)


Lemma  G-5 

Suppose that the $(\pi, l)$ row is not generated. Let $p$ be the shortest restoration path for
commodity $\pi$ against a failure of link $l$ in the network with the arc cost $\{\mu_a^l\}$.
Then, the dual feasibility condition (E-5) is guaranteed for all $(\pi, l)$ restoration path
variables if the simplex multiplier $w^{\pi,l}$ is selected as
$w^{\pi,l} = \sum_{a \in p} \mu_a^l$.

(proof)

Due to the structure of the column vector of the restoration variable, the simplex multiplier
is given by $w^{\pi,l} = \sum_{a \in p} \mu_a^l$ for a certain path, $p \in RP^{\pi,l}$. In
addition, the value of $\mu_a^l$ is determined by the basis matrix $B$ and is independent of
$w^{\pi,l}$. Thus, in order to satisfy the dual feasibility condition (E-5) for all $(\pi, l)$
restoration path variables, $p$ must be the shortest path in $G \setminus \{l\}$ with the arc
cost $\{\mu_a^l\}$.

(Q.E.D.)

Finally,  the  following  two  theorems  assure  the  validity  of  the  proposed  algorithm. 

Theorem  G-6 

The  proposed  algorithm  terminates  in  a  finite  number  of  iterations. 

(proof) 

Let $\Omega$ be a set of all commodity flow columns and restoration flow columns, and let
$\Omega_k$ be the subset of $\Omega$ containing all generated columns upon the $k$-th
invocation of the sub-process. Since the sub-process is guaranteed to produce the optimal value
with the currently generated columns, $\Omega_{k+1}$ is different from $\Omega_k$ if the
objective function value improves at the $k$-th iteration. Since the objective function value
will never increase, the subset $\Omega_k$ will not appear in a later iteration. If there is no
improvement, then the subset keeps growing, and thus it does not return to a previous state.
Since the number of subsets of $\Omega$ is finite, the algorithm terminates in a finite number
of iterations.

(Q.E.D.) 


Theorem  G-7 

The  proposed  algorithm  terminates  with  a  globally  optimal  solution. 

(proof) 

When the algorithm terminates, dual feasibility is assured for all generated rows.
Furthermore, Lemma G-5 guarantees dual feasibility for all non-generated rows. Since the
solution is also primally feasible, it is globally optimal [53].

(Q.E.D.)


Appendix H. SVPR-MF-LINE: Problem formulation


The SVPR problem shares a common structure with the SCFA problem. A major difference
resides in the constraint c) where full restorability constraints are replaced by restoration
flow constraints. The restoration flow constraints refer to restoring the affected flow over
restoration paths. If there is not enough spare bandwidth, a part of the affected bandwidth
could be lost. In order to accommodate this phenomenon, a lost flow variable $t = (t_a)$
($a \in A$) is introduced in the formulation of the SVPR problem. $t_a$ takes the amount of an
unrecoverable flow of arc $a$ due to a failure of link $l$ ($l \ni a$).

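Now, the SVPR-MF-LINE problem is formulated as a large-scale linear programming problem, as in
the SCFA-MF-LINE problem:

(SVPR-MF-LINE)

Given $G = (V, A, c)$ and $Q$,

\[
\begin{aligned}
\text{Minimize} \quad & L \cdot |E| = \sum_{a \in A} t_a \\
\text{over} \quad & x \ge 0,\ r \ge 0,\ \text{and } t \ge 0 \\
\text{subject to} \quad
& \text{a)}\ \sum_{p \in P^{\pi}} x_p^{\pi} = q^{\pi}
  && \forall \pi \in \Pi && \text{(H-1)} \\
& \text{b)}\ f_a + z_a^l \le c_a
  && \forall a \in A,\ \forall l \in E && \text{(H-2)} \\
& \text{c)}\ -f_a + \sum_{p \in RP^{a}} r_p^{a} + t_a = 0
  && \forall a \in A && \text{(H-3)}
\end{aligned}
\]

where $f_a = \sum_{\pi \in \Pi} \sum_{p \in P^{\pi}} \delta_{a,p} \cdot x_p^{\pi}$ and
$z_a^l = \sum_{b \in l} \sum_{p \in RP^{b}} \delta_{a,p} \cdot r_p^{b}$.
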

The arc-chain flow representation is employed. The constraints (H-1), (H-2) and (H-3)
correspond to the flow conservation law, the capacity constraints and the restoration flow
constraints, respectively. In the SVPR problem, $z_a^l$ expresses the amount of total
restoration flow over arc $a$ upon a failure of link $l$. The restoration flow constraints state
that the amount of an unrecoverable flow of arc $a$ plus the sum of restoration flows for the
arc is equal to the arc flow (the amount of a flow to be recovered). The dual formulation of the
above problem is given by (SVPR-MF-LINE-Dual).

(SVPR-MF-LINE-Dual)

\[
\begin{aligned}
\text{Maximize} \quad & \sum_{\pi \in \Pi} q^{\pi} \cdot \sigma^{\pi}
  + \sum_{a \in A} c_a \cdot \Big( \sum_{l \in E} \mu_a^l \Big) \\
\text{over} \quad & \sigma^{\pi},\ w^{a} \text{ unrestricted},\ \mu_a^l \le 0 \\
\text{subject to} \quad
& \text{i)}\ \sum_{a \in p} \Big( w^{a} - \sum_{l \in E} \mu_a^l \Big) \ge \sigma^{\pi}
  && \forall p \in P^{\pi},\ \forall \pi \in \Pi && \text{(H-4)} \\
& \text{ii)}\ \sum_{b \in p} (-\mu_b^l) \ge w^{a} \quad (l \ni a)
  && \forall p \in RP^{a},\ \forall a \in A && \text{(H-5)} \\
& \text{iii)}\ w^{a} \le 1
  && \forall a \in A
\end{aligned}
\]

The dual variables $\sigma^{\pi}$, $\mu_a^l$ and $w^{a}$ are the simplex multipliers
corresponding to the constraints (H-1), (H-2) and (H-3), respectively.


Appendix I. SVPR-MF-LINE: Solution approach

Due to the similarity in their formulations, the strategies developed for the SCFA-MF-LINE
problem can be applied to the SVPR-MF-LINE problem with a slight modification. The arc-chain
flow representation and the column generation approach are employed to reduce the size of
constraints. The solution process is decomposed into the master process and the sub-process.
Only a part of commodity path flow variables and restoration path flow variables are generated.
The master process tests the global optimality by checking the conditions (H-4) and (H-5). A
shortest path algorithm can be applied for this test. In the sub-process, LU decomposition of
the basis matrix is calculated by exploiting the special structure of the matrix and is
manipulated through the direct LU update mechanism introduced in Section 3.2.2. The following
two statements can be shown to be true at the end of all iterations of the sub-process if
$q^{\pi} > 0$ for $\forall \pi \in \Pi$:

1. For each commodity $\pi \in \Pi$, at least one commodity path flow variable, $x_p^{\pi}$
($p \in P^{\pi}$), is in the basis.

2. For each arc $a \in A$, it is possible to maintain at least one restoration path flow
variable, $r_p^{a}$ ($p \in RP^{a}$), or the corresponding lost flow variable, $t_a$, in the
basis.

Using this observation, the basis matrix $B$ can be arranged in the following standard form at
each iteration, which can be readily decomposed into an LU form:

\[
B =
\begin{bmatrix}
I   &     &   & C   \\
R   & I   &   & D   \\
A_1 & H_1 & I & M_1 \\
A_2 & H_2 &   & M_2
\end{bmatrix}
=
\begin{bmatrix}
I   &     &   &   \\
R   & I   &   &   \\
A_1 & H_1 & I &   \\
A_2 & H_2 &   & L
\end{bmatrix}
\cdot
\begin{bmatrix}
I &   &   & C   \\
  & I &   & F   \\
  &   & I & Y_1 \\
  &   &   & U
\end{bmatrix}
\tag{I-1}
\]

where $F = D - RC$, $Y_i = M_i - A_i C - H_i F$ for $i = 1, 2$ and $Z = L \cdot U = Y_2$. The
first set of columns corresponds to the key commodity path flow variables, and the second set
contains the key restoration path flow variables or lost flow variables. The remaining commodity
and restoration path flow variables are collected into the last set, while the slack variables
for the capacity constraints are arranged in the third set of columns. Note that all lost flow
variables in the basis are placed in the second set. All basic restoration path variables for
arc $a$ are arranged into the last set if the corresponding lost flow variable $t_a$ is in the
basis. The first set of rows corresponds to the flow conservation laws, the second to the
restoration flow constraints, and the last two represent capacity constraints.
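
The block factorization can be checked numerically on a toy instance. The sketch below assumes
that the standard form consists of the four block rows $[I, 0, 0, C]$, $[R, I, 0, D]$,
$[A_1, H_1, I, M_1]$ and $[A_2, H_2, 0, M_2]$, and replaces every block by a hypothetical scalar
so that the identity $B = L \cdot U$ can be verified with plain Python lists. The numerical
values carry no meaning beyond the illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Hypothetical scalar stand-ins for the blocks of the basis matrix.
C, R, D = 2.0, 3.0, 10.0
A1, H1, M1 = 1.0, 4.0, 20.0
A2, H2, M2 = 5.0, 6.0, 40.0

F = D - R * C               # F = D - RC
Y1 = M1 - A1 * C - H1 * F   # Y_1 = M_1 - A_1 C - H_1 F
Z = M2 - A2 * C - H2 * F    # Z = Y_2, the only block that needs factorizing

B = [[1, 0, 0, C],
     [R, 1, 0, D],
     [A1, H1, 1, M1],
     [A2, H2, 0, M2]]
lower = [[1, 0, 0, 0],
         [R, 1, 0, 0],
         [A1, H1, 1, 0],
         [A2, H2, 0, 1.0]]  # trailing block L (1 for a 1x1 Z)
upper = [[1, 0, 0, C],
         [0, 1, 0, F],
         [0, 0, 1, Y1],
         [0, 0, 0, Z]]      # trailing block U (Z itself for a 1x1 Z)
product = matmul(lower, upper)
```

Only the trailing block $Z$ requires a genuine factorization; all other blocks of the two
factors are read directly from $B$ or obtained by a few sparse products.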

Figure I-1 illustrates the structure of the solution procedure. The structure is almost
identical to that for the SCFA-MF-LINE problem. A major difference resides in the
initialization process. Unlike the SCFA-MF-LINE problem, there is no obvious feasible solution
satisfying the capacity constraints. This issue can be circumvented by the two-phase approach of
the simplex algorithm with the introduction of artificial variables [54]. Details are given in
Appendix R.1. In brief, the initialization process takes the following steps:

1. Obtain a shortest route commodity flow. If it is feasible, go to Step 3.

2. Find an initial feasible solution by solving an LP problem with artificial variables.

3. Set all restoration flow variables as non-basic.

4. Calculate the amount of lost flow for each arc. Set all lost flow variables as basic and
place them in the second set of columns in Equation (I-1).
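
Steps 1, 3 and 4 of the initialization can be sketched as follows. Step 2 (the LP with
artificial variables) is deliberately left out. The sketch assumes fewest-hop routing as the
"shortest route", directed arcs stored as node pairs, and hypothetical function names; with all
restoration flows at zero, the restoration flow constraint (H-3) gives $t_a = f_a$.

```python
from collections import defaultdict, deque


def fewest_hop_path(adj, src, dst):
    """Breadth-first search: a fewest-hop path from src to dst, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in sorted(adj[u]):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


def initialize(adj, capacity, demands):
    """Return (feasible, arc_flow, lost_flow) for a shortest route flow.

    capacity maps a directed arc (u, v) to c_a; demands maps (src, dst) to q.
    """
    arc_flow = defaultdict(float)
    for (src, dst), q in demands.items():
        path = fewest_hop_path(adj, src, dst)
        if path is None:
            raise ValueError(f"no route for demand {(src, dst)}")
        for arc in zip(path, path[1:]):
            arc_flow[arc] += q
    feasible = all(flow <= capacity[arc] for arc, flow in arc_flow.items())
    # All restoration flows are non-basic (zero), so t_a = f_a by (H-3).
    lost_flow = dict(arc_flow)
    return feasible, dict(arc_flow), lost_flow


# Hypothetical triangle network with capacity 10 on every directed arc.
adj = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
capacity = {(u, v): 10.0 for u in adj for v in adj[u]}
demands = {("A", "C"): 4.0, ("B", "C"): 7.0, ("A", "B"): 3.0}
feasible, arc_flow, lost_flow = initialize(adj, capacity, demands)
```

When `feasible` is False, the shortest route flow violates a capacity constraint and Step 2
would be required to find a feasible starting point.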

Figure I-1. Solution procedure for the SVPR-MF-LINE problem


Appendix  J.  SVPR-MF-LINE:  Validity  of  the  algorithm 


The following Lemmas and Theorems assure the correctness of the proposed algorithm
described in the previous section. Since they can be proven based on the same arguments as those
for their counterparts in the SCFA-MF-LINE problem, the proofs are omitted here.

Lemma J-1

Suppose that the only arc restoration flow or arc lost flow variable for arc $a$ is selected as
a candidate for a leaving variable in the sub-process. Furthermore, suppose that an entering
variable is not an arc restoration flow or arc lost flow variable for arc $a$. Then, another
candidate can always be found for a leaving variable.

Theorem J-2

Assume $q^{\pi} > 0$ for $\forall \pi \in \Pi$. Then, at each iteration of the sub-process, it
is possible to select a leaving variable so that the standard form of the basis matrix $B$ can
be maintained.

Lemma J-3

At the end of the sub-process, the following two statements are true:

i) No negative cycle exists in the network $G$ with arc cost
$\{ w^{a} - \sum_{l \in E} \mu_a^l \}$.

ii) For $\forall l \in E$, no negative arc exists in the network $G \setminus \{l\}$ with arc
cost $\{ -\mu_a^l \}$.


Appendix  K.  SVPR-MF-ETE:  Problem  formulation 


The end-to-end restoration scheme recovers failed virtual paths between their source and
destination nodes. Thus, a lost flow variable is defined for each combination of a commodity and
a failure event. Let $t^{\pi,l}$ ($\pi \in \Pi$, $l \in E$) denote a lost flow variable which
takes the amount of an unrecoverable flow of commodity $\pi$ upon a failure of link $l$, and
define a lost flow vector $t = (t^{\pi,l})$. Then, the SVPR-MF-ETE problem is formulated as
follows:


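(SVPR-MF-ETE)

Given $G = (V, A, c)$ and $Q$,

\[
\begin{aligned}
\text{Minimize} \quad & L \cdot |E| = \sum_{l \in E} \sum_{\pi \in \Pi} t^{\pi,l} \\
\text{over} \quad & x \ge 0,\ r \ge 0 \text{ and } t \ge 0 \\
\text{subject to} \quad
& \text{a)}\ \sum_{p \in P^{\pi}} x_p^{\pi} = q^{\pi}
  && \forall \pi \in \Pi && \text{(K-1)} \\
& \text{b-i)}\ f_a - r_a^l + z_a^l \le c_a
  && \forall a \in A,\ \forall l \in E\ (l \not\ni a) && \text{(K-2)} \\
& \text{b-ii)}\ f_a \le c_a
  && \forall a \in A && \text{(K-3)} \\
& \text{c)}\ -\sum_{a \in l} \sum_{p \in P^{\pi}} \delta_{a,p} \cdot x_p^{\pi}
  + \sum_{p \in RP^{\pi,l}} r_p^{\pi,l} + t^{\pi,l} = 0
  && \forall \pi \in \Pi,\ \forall l \in E && \text{(K-4)}
\end{aligned}
\]

where

\[
f_a = \sum_{\pi \in \Pi} \sum_{p \in P^{\pi}} \delta_{a,p} \cdot x_p^{\pi}, \qquad
z_a^l = \sum_{\pi \in \Pi} \sum_{p \in RP^{\pi,l}} \delta_{a,p} \cdot r_p^{\pi,l}, \qquad
r_a^l = \sum_{\pi \in \Pi} \sum_{p \in P^{\pi}} \delta_{a,l,p} \cdot x_p^{\pi}.
\]
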

The flow conservation law and the restoration flow constraints are expressed by Equations (K-1)
and (K-4), respectively. Unlike the previous formulations, two types of capacity constraints are
required: one in the event of a failure (Equation (K-2)) and the other under normal conditions
(Equation (K-3)). Since a restoration flow is additionally sent over the existing flow in the
case of line restoration, the capacity constraints in the event of a failure ensure the
feasibility of a flow under normal conditions. This is not true for end-to-end restoration since
a part of working capacity could be released at the time of a failure. If the amount of the
released bandwidth at arc $a$ ($r_a^l$) is more than the amount of the total restoration flow
over the arc ($z_a^l$) upon a failure of link $l$, then Equation (K-2) does not imply Equation
(K-3). Thus, Equation (K-3) is necessary. Corresponding to this new constraint, an additional
simplex multiplier must be introduced in the dual formulation.
Define $\mu_a^0$ as the dual variable corresponding to the constraint (K-3), and let
$\mu_a^l = \mu_a^0$ for $l \ni a$. The dual formulation of the above problem is obtained as
follows:


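(SVPR-MF-ETE-Dual)

\[
\begin{aligned}
\text{Maximize} \quad & \sum_{\pi \in \Pi} q^{\pi} \cdot \sigma^{\pi}
  + \sum_{a \in A} c_a \cdot \sum_{l \in E} \mu_a^l \\
\text{over} \quad & \sigma^{\pi},\ w^{\pi,l} \text{ unrestricted},\ \mu_a^l \le 0 \\
\text{subject to} \quad
& \text{i)}\ \sum_{a \in p} \Big\{ w^{\pi,l}\big|_{l \ni a}
  + \sum_{l \in E} (-\mu_a^l)
  - \sum_{l \in p,\ l \not\ni a} (-\mu_a^l) \Big\} \ge \sigma^{\pi}
  && \forall p \in P^{\pi},\ \forall \pi \in \Pi && \text{(K-5)} \\
& \text{ii)}\ \sum_{a \in p} (-\mu_a^l) \ge w^{\pi,l}
  && \forall p \in RP^{\pi,l},\ \forall \pi \in \Pi,\ \forall l \in E && \text{(K-6)} \\
& \text{iii)}\ w^{\pi,l} \le 1
  && \forall \pi \in \Pi,\ \forall l \in E
\end{aligned}
\]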
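
The need for the separate normal-condition constraint (K-3) can be seen from a small numerical
sketch. It assumes that (K-2) reads $f_a - r_a^l + z_a^l \le c_a$ and (K-3) reads
$f_a \le c_a$; the function names and the numbers are hypothetical.

```python
def holds_k2(f, released, restored, c):
    """Capacity constraint in the event of a failure (assumed form of K-2)."""
    return f - released + restored <= c


def holds_k3(f, c):
    """Capacity constraint under normal conditions (assumed form of K-3)."""
    return f <= c


# Hypothetical arc: working flow 12, capacity 10. Upon the failure, 5 units
# are released on the arc and only 2 units of restoration flow arrive.
k2 = holds_k2(f=12.0, released=5.0, restored=2.0, c=10.0)
k3 = holds_k3(f=12.0, c=10.0)
```

Since the released bandwidth exceeds the restoration flow, the failure-time constraint is
satisfied although the arc is overloaded under normal conditions, which is why (K-3) cannot be
dropped for end-to-end restoration.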


Appendix  L.  SVPR-MF-ETE:  Solution  approach 


The strategies developed to solve the SCFA-MF-ETE are applicable to the SVPR-MF-ETE
problem due to the common structures of the two problems: the arc-chain flow representation,
the column generation and deletion technique, the LU decomposition based on a special
arrangement of the basic variables, the direct LU update, and the row generation and deletion
technique. The standard basis matrix arrangement and its LU decomposition are given by:

\[
\begin{bmatrix}
I   &     &   & C   \\
R   & I   &   & D   \\
A_1 & H_1 & I & M_1 \\
A_2 & H_2 &   & M_2
\end{bmatrix}
=
\begin{bmatrix}
I   &     &   &   \\
R   & I   &   &   \\
A_1 & H_1 & I &   \\
A_2 & H_2 &   & L
\end{bmatrix}
\cdot
\begin{bmatrix}
I &   &   & C   \\
  & I &   & F   \\
  &   & I & Y_1 \\
  &   &   & U
\end{bmatrix}
\]

where $F = D - RC$, $Y_i = M_i - A_i C - H_i F$ for $i = 1, 2$ and $Z = L \cdot U = Y_2$. The
structure of the basis matrix is equivalent to that for the SVPR-MF-LINE problem, although each
submatrix has different components. Furthermore, the size of the second row and column grows
explosively in the SVPR-MF-ETE problem. The following observation allows the basis matrix to be
arranged in a standard form. Assume $q^{\pi} > 0$ for $\forall \pi \in \Pi$. Then at the end of
any iteration of the sub-process,

1. at least one commodity path flow variable, $x_p^{\pi}$ ($p \in P^{\pi}$), is in the basis for
each commodity $\pi \in \Pi$, and

2. it is possible to maintain at least one restoration path flow variable, $r_p^{\pi,l}$
($p \in RP^{\pi,l}$), or the corresponding lost flow variable, $t^{\pi,l}$, in the basis, for
each combination of $\pi \in \Pi$ and $l \in E$.


Figure L-1 depicts the structure of the solution procedure for the SVPR-MF-ETE problem. In the master process, a dual feasibility test is conducted through a shortest path algorithm (Equation (K-5)) or the quadratic shortest path algorithm (Equation (K-6)) described in Appendix Q. An initial feasible solution can be obtained in the same manner as in the SVPR-MF-LINE problem. Refer to Appendix R.2 for details.
Master Process - Check global optimality

1. If the objective function value improved during the previous sub-process,

a) Remove all flow variables which have stayed out of the basis for a long period.

b) Remove inactive restoration flow constraints (rows).

2. Calculate simplex multipliers for all constraints.

3. Check $x_p$-related dual feasibility. Generate columns/rows if necessary.

4. Check $r_p$-related dual feasibility. Generate columns if necessary.

5. If no column is generated, then return.


Sub-Process - Perform a revised simplex algorithm on generated columns/rows

1. Calculate simplex multipliers for currently generated constraints.

2. Pricing-out operation. Select an entering variable. If there is no candidate, return.

3. Ratio test. Select a leaving variable so that a standard form can be maintained.

4. Update the basis and its LU factorization through a direct LU update. Go back to 1.


Figure  L-1.  Solution  procedure  for  the  SVPR-MF-ETE  problem 


Appendix  M.  SVPR-MF-ETE:  Validity  of  the  algorithm 

The  arguments  developed  for  the  SCFA-MF-ETE  problems  as  well  as  the  SVPR-MF-LINE 
problem  can  also  be  applied  to  verify  the  proposed  algorithm.  This  section  lists  related  Lemmas 
and  Theorems  for  the  sake  of  completeness. 
Lemma M-1


Consider an active $(\pi, l)$ constraint. Suppose that the only $(\pi, l)$ related variable (restoration path variable or lost flow variable) in the basis is selected as a leaving variable at some point in the sub-process. Furthermore, suppose that the entering variable is not a $(\pi, l)$ related variable. Then, we can always find another candidate for a leaving variable.

Theorem  M-2 

Assume $q^{\pi} > 0$ for all $\pi \in \Pi$. Then, at the end of an iteration, a new basis matrix can be arranged in the standard form given in Equation (L-1).

Lemma  M-3 

At  the  end  of  the  sub-process,  the  following  two  inequalities  hold, 
w  -  u  >  0  for  V<2  e  A 

<  0  for  Va  e  A, 'il& 

Lemma  M-4 

The row generation and deletion procedures in the master process do not change the value of $Z = L \cdot U$ in Equation (L-1).

Lemma  M-5 

Suppose  that  the  (ti,  /)  row  is  not  generated.  Let  p  be  the  shortest  restoration  path  for 

commodity  Ji  against  a  failure  of  link  I  in  the  network  with  the  arc  cost  {-p^  }.  Then,  the 

dual  feasibility  condition  (K-6)  is  guaranteed  for  all  (ti,  /)  restoration  path  variables  if  the 

simplex  multiplier  w"’  is  selected  as  =  -V  . 

€  /?  ^ 

Theorem  M-6 

The  proposed  algorithm  terminates  in  a  finite  number  of  iterations. 
Theorem M-7

The  proposed  algorithm  terminates  with  a  globally  optimal  solution. 


Appendix  N.  SVPR-KSP-LINE:  Problem  formulation 


Different problem formulations can lead to different solution procedures. We consider two formulations which differ in the way they accommodate restoration flow constraints. The first formulation (SVPR-KSP-LINE1) embeds such constraints into the objective function.


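Given

$G = (V, A, c)$ and $Q$.

(SVPR-KSP-LINE1)

Minimize

$$L(\mathbf{f}) = \frac{1}{|E|} \sum_{l \in E} L_l(\mathbf{f}) \qquad \text{(N-1)}$$

over

$$\mathbf{f} \ge 0 \qquad \text{(N-2)}$$

subject to

a) $$\sum_{a \in A} \phi_{v,a} \, f_a^{\pi} = q_v^{\pi} \qquad \forall \pi \in \Pi,\ \forall v \in V \qquad \text{(N-3)}$$

b) $$f_a = \sum_{\pi \in \Pi} f_a^{\pi} \le c_a \qquad \forall a \in A \qquad \text{(N-4)}$$

where

$$\phi_{v,a} = \begin{cases} 1 & \text{(if arc } a \text{ leaves node } v\text{)} \\ -1 & \text{(if arc } a \text{ enters node } v\text{)} \\ 0 & \text{(otherwise)} \end{cases}$$

$$q_v^{\pi} = \begin{cases} q^{\pi} & \text{(if } v \text{ is an originating node of commodity } \pi\text{)} \\ -q^{\pi} & \text{(if } v \text{ is a destination node of commodity } \pi\text{)} \\ 0 & \text{(otherwise)} \end{cases}$$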


Equations (N-3) and (N-4) embody the flow conservation law and the capacity constraints, respectively. The arc-node flow representation [9] is employed for the commodity flow: $f_a^{\pi}$ denotes the amount of flow for commodity $\pi$ over arc $a$. $L(\mathbf{f})$ gives the expected lost flow, while $L_l(\mathbf{f})$ is the lost flow due to a failure of link $l$. Under this formulation, the problem becomes a non-linear, non-smooth multicommodity flow problem with linear constraints.
In the calculation of the expected lost flow, we assume that it is possible to find a subset of affected virtual paths which completely fills up the bandwidth of a discovered restoration route. Since each link (e.g., Gbps to Tbps capacity for optical fiber) is shared among a number of commodities and each commodity is supposed to involve a large number of virtual paths, a single link will be shared by a significant number of virtual paths. This means that the bandwidth of a virtual path will be considerably smaller than the link bandwidth and the spare bandwidth over a restoration route. Therefore, even though no subset of virtual paths may fit exactly into the bandwidth of a restoration route, such a gap should be small, and the formulation based on this assumption should give a good approximation of the problem.


Now, $L_l(\mathbf{f})$ is computed as follows. Consider a restoration process upon a failure of link $l$. Suppose that link $l$ consists of two directed arcs, $a$ and $b$, and assume that the restoration for each arc is processed in parallel. Let $c_{\beta}^{(l,k)}(\mathbf{f})$ denote the residual capacity of arc $\beta \in A \setminus \{a, b\}$ after the link restoration using the first $k$ restoration routes for arcs $a$ and $b$. It is given by the following recursive formula:

$$c_{\beta}^{(l,0)}(\mathbf{f}) = c_{\beta} - f_{\beta}$$

$$c_{\beta}^{(l,k)}(\mathbf{f}) = \begin{cases} c_{\beta}^{(l,k-1)}(\mathbf{f}) - r_{RP^{(\alpha,k)}}(\mathbf{f}) & \text{if } \beta \in RP^{(\alpha,k)} \text{ for some } \alpha \in \{a, b\} \\ c_{\beta}^{(l,k-1)}(\mathbf{f}) & \text{otherwise} \end{cases}$$

where $RP^{(\alpha,k)}$ is the set of directed arcs which are contained in the $k$-th shortest restoration route of arc $\alpha \in \{a, b\}$. Note that since a link is assumed to be bidirectional, the $k$-th shortest restoration routes for arcs $a$ and $b$ pass through the same links but in opposite directions. Thus, $RP^{(a,k)} \cap RP^{(b,k)} = \emptyset$. $r_{RP^{(\alpha,k)}}(\mathbf{f})$ is the restorable amount of flow of arc $\alpha \in l$ over the $k$-th restoration route. It is given by

$$r_{RP^{(\alpha,k)}}(\mathbf{f}) = \min\Big\{ f_{\alpha} - \sum_{i=1}^{k-1} r_{RP^{(\alpha,i)}}(\mathbf{f}),\ \big\{ c_{\beta}^{(l,k-1)}(\mathbf{f}) : \beta \in RP^{(\alpha,k)} \big\} \Big\} \qquad \text{(N-5)}$$

Namely, $r_{RP^{(\alpha,k)}}(\mathbf{f})$ is calculated as the minimum of (a), the amount of flow which cannot be recovered over the first $k-1$ restoration routes, and (b), the restorable amount of flow over the $k$-th restoration route. The latter, (b), is equal to the minimum residual arc capacity over the route. Then, the total amount of restorable flow of arc $\alpha \in l$, $RST_{\alpha}(\mathbf{f})$, is obtained by

$$RST_{\alpha}(\mathbf{f}) = \sum_{k=1}^{|RP^{\alpha}|} r_{RP^{(\alpha,k)}}(\mathbf{f})$$

where $RP^{\alpha}$ is the set of restoration routes for arc $\alpha$. Finally, $L_l(\mathbf{f})$ is computed by

$$L_l(\mathbf{f}) = (f_a - RST_a(\mathbf{f})) + (f_b - RST_b(\mathbf{f}))$$
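The recursion around (N-5) can be illustrated with a small numerical sketch. The following Python fragment is only an illustration under stated assumptions, not the implementation used in this work: the function name `link_lost_flow`, the dictionary-based data layout, and the clamping of each restored amount at zero are all hypothetical choices. Because the $k$-th routes of the two arcs of a link share no directed arc, processing the two arcs one after the other within the same $k$ gives the same result as the parallel processing assumed in the text.

```python
from typing import Dict, List, Tuple

Arc = Tuple[int, int]  # (tail node, head node); hypothetical representation


def link_lost_flow(
    flow: Dict[Arc, float],
    capacity: Dict[Arc, float],
    arc_a: Arc,
    arc_b: Arc,
    routes: Dict[Arc, List[List[Arc]]],
) -> float:
    """Lost flow for a failure of the link {arc_a, arc_b}.

    routes[alpha][k] lists the arcs of the (k+1)-th shortest restoration
    route of arc alpha, ordered from shortest to longest.
    """
    failed = (arc_a, arc_b)
    # Residual capacity before any restoration: c_beta - f_beta.
    residual = {
        beta: capacity[beta] - flow[beta] for beta in capacity if beta not in failed
    }
    restored = {arc_a: 0.0, arc_b: 0.0}
    max_k = max(len(routes[alpha]) for alpha in failed)
    for k in range(max_k):
        for alpha in failed:
            if k >= len(routes[alpha]):
                continue
            route = routes[alpha][k]
            unrestored = flow[alpha] - restored[alpha]        # term (a)
            bottleneck = min(residual[beta] for beta in route)  # term (b)
            # Clamp at zero (an assumption) so that an overloaded arc on
            # the route never produces a negative restored amount.
            r_k = max(0.0, min(unrestored, bottleneck))
            for beta in route:
                residual[beta] -= r_k
            restored[alpha] += r_k
    # Lost flow: (f_a - RST_a) + (f_b - RST_b).
    return sum(flow[alpha] - restored[alpha] for alpha in failed)
```

In this sketch the routes are supplied by the caller, which corresponds to using a precalculated table of restoration routes rather than searching the residual network on every evaluation.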

The second formulation (SVPR-KSP-LINE2) describes the restoration flow constraints with the aid of the restoration path variables and the lost flow variables. This leads to a formulation similar to that of the SVPR-MF-LINE problem, but additional constraints are required in order to express the relationship given by (N-5). We call such constraints the KSP constraints. Although this formulation can eliminate the non-smoothness from the objective function, the constraint set becomes non-convex.


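Given

$G = (V, A, c)$ and $Q$.

(SVPR-KSP-LINE2)

Minimize

[objective function: a sum over $a \in A$; the summand is illegible in the source]

over

$\mathbf{x} = (x_p^{\pi}) \ge 0$, $\mathbf{r} = (r_{RP^{(a,k)}}) \ge 0$, and $\mathbf{t} = (t_a) \ge 0$

subject to

a) $$\sum_{p \in P^{\pi}} x_p^{\pi} = q^{\pi} \qquad \forall \pi \in \Pi \qquad \text{(N-7)}$$

b) [capacity constraint: the expression is illegible in the source] $\qquad \forall a \in A,\ \forall l \in E \qquad \text{(N-8)}$

c) $$-f_a + \sum_{k \in [RP^{a}]} r_{RP^{(a,k)}} + t_a = 0 \qquad \forall a \in A \qquad \text{(N-9)}$$

d) $$r_{RP^{(a,k)}} = \min\Big\{ f_a - \sum_{i=1}^{k-1} r_{RP^{(a,i)}},\ \big\{ c_{RP_h^{(a,k)}}^{(l,k-1)} : h \in [RP^{(a,k)}] \big\} \Big\} \qquad a \in l,\ \forall a \in A,\ \forall k \in [RP^{a}] \qquad \text{(N-10)}$$

where $c_{\beta}^{(l,k-1)}$, with $l = \{a, b\}$, is the residual capacity of arc $\beta$ expressed in terms of the restoration path variables [the closed-form expression is illegible in the source].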

The first three constraints (flow conservation law (N-7), capacity constraints (N-8) and restoration flow constraints (N-9)) are almost the same as those for the SVPR-MF-LINE problem, except that the restoration paths are restricted due to a hop limit (see Section 2.1.2)¹⁵. $RP_h^{(a,k)}$ denotes the $h$-th arc in the $k$-th restoration path for arc $a$, and $[X] = \{1, \ldots, |X|\}$. As before, $c_{\beta}^{(l,k)}$ gives the residual capacity of arc $\beta$ after restoration using the first $k$ restoration paths upon a failure of link $l$. The KSP constraints are given by Equation (N-10). Each KSP constraint can be replaced by the following two inequalities:


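$$r_{RP^{(a,k)}} \le \min\Big\{ f_a - \sum_{i=1}^{k-1} r_{RP^{(a,i)}},\ \big\{ c_{RP_h^{(a,k)}}^{(l,k-1)} : h \in [RP^{(a,k)}] \big\} \Big\} \qquad a \in l \qquad \text{(N-11)}$$

$$r_{RP^{(a,k)}} \ge \min\Big\{ f_a - \sum_{i=1}^{k-1} r_{RP^{(a,i)}},\ \big\{ c_{RP_h^{(a,k)}}^{(l,k-1)} : h \in [RP^{(a,k)}] \big\} \Big\} \qquad a \in l \qquad \text{(N-12)}$$

Note that Equation (N-11) is satisfied by constraints (N-8) and (N-9), and thus it is unnecessary.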


Appendix  O.  SVPR-KSP-LINE:  Solution  approach  -  Modified 
flow  deviation  (MFD)  method 


The SVPR-KSP-LINE problem based on the first formulation (SVPR-KSP-LINE1) is a non-linear (non-smooth) programming problem with linear constraints. Such a problem can be addressed by an iterative method where a new feasible flow is found at each iteration, and the value of the objective function monotonically decreases. In order to maintain feasibility, not only must a flow be a legal multicommodity flow¹⁶ [9], but the capacity constraints (N-4) must also be checked at each iteration. The method also needs to find an initial feasible flow, which may be obtained through a linear programming algorithm. Instead of solving the above problem directly, we relax the capacity constraints as in [28]. If the total flow of each link is allowed to exceed its capacity, the capacity constraints can be removed from the formulation, and the iterative procedure is greatly simplified. Moreover, an initial solution is now obtained through a shortest path algorithm, which is less computationally intensive than a linear programming approach, especially for a large network.

15. Thus, the first three constraints are the same as the constraints for the SVPR-MF-LINE problem with an RP option.

Let $\Omega_A = \{\mathbf{f} : \mathbf{f}$ is a multicommodity flow ($\mathbf{f}$ satisfies (N-2) and (N-3))$\}$ and $\Omega_B = \Omega_A \cap \{\mathbf{f} : \mathbf{f}$ satisfies (N-4)$\}$. The removal of the capacity constraints means expanding the feasible region from $\Omega_B$ to $\Omega_A$. Then, in order to avoid convergence at a point in $\Omega_A \setminus \Omega_B$, it is necessary that all local minima reside in $\Omega_B$. This is accomplished by introducing a penalty function and modifying the definition of $L$ as follows:

$$L^*(\mathbf{f}) = \frac{1}{|E|} \sum_{l \in E} L_l^*(\mathbf{f})$$

where

$$L_l^*(\mathbf{f}) = L_l(\mathbf{f}) + \omega \cdot \sum_{a \in A} \max\{0, f_a - c_a\}$$

Thus, if there is an arc $a \in A$ where the capacity constraint is violated, the excess flow $f_a - c_a$ is treated as a big lost flow by adding $\omega \cdot (f_a - c_a)$ to $L_l(\mathbf{f})$ for all $l \in E$.
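As a small numerical illustration of the penalty term, the following Python sketch evaluates the penalized objective from per-link lost flows that are assumed to be already available. The function names and the dictionary layout are hypothetical, and equal weights on the failure events are assumed.

```python
from typing import Dict, Hashable, List


def capacity_excess(flow: Dict[Hashable, float], capacity: Dict[Hashable, float]) -> float:
    """Sum over all arcs of max{0, f_a - c_a}."""
    return sum(max(0.0, flow[a] - capacity[a]) for a in flow)


def penalized_expected_lost_flow(
    lost_flows: List[float],
    flow: Dict[Hashable, float],
    capacity: Dict[Hashable, float],
    omega: float,
) -> float:
    """Average over links of L_l(f) + omega * excess (equal failure weights)."""
    penalty = omega * capacity_excess(flow, capacity)
    return sum(lost + penalty for lost in lost_flows) / len(lost_flows)


# Two arcs, one of them overloaded by 2 units; omega is chosen larger than |A| = 2.
flow = {"a": 12.0, "b": 3.0}
capacity = {"a": 10.0, "b": 5.0}
value = penalized_expected_lost_flow([1.0, 0.0], flow, capacity, omega=3.0)
```

Since the same penalty is added to every link term, a capacity violation raises the objective regardless of which link fails.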

The following Lemma gives a sufficient condition on the multiplier, $\omega$, in order to guarantee the non-existence of local minima outside $\Omega_B$. This suggests that, if an iterative method can always find a feasible descent direction, it can eventually find a point in $\Omega_B$ with a lower value of $L^*$, even if it starts from a point in $\Omega_A \setminus \Omega_B$.

Lemma O-1

Suppose $\omega > |A|$ and $\Omega_B \ne \emptyset$. Then, $L^*(\mathbf{f})$ has no local minima in $\Omega_A \setminus \Omega_B$.

(Proof)

Let $\mathbf{f} \in \Omega_A \setminus \Omega_B$ and let $A'$ ($A' \subset A$) be the set of arcs whose flow exceeds the capacity in $\mathbf{f}$. Then, for an arc $a \in A'$ there is an $\mathbf{f}^{\pi}$-incrementing path [16], $P_a$, from $i$ to $j$ for some commodity $\pi \in \Pi$, where $a = (i, j)$. Otherwise, the excess flow $e_a = f_a - c_a$ cannot be rerouted, contradicting the assumption $\Omega_B \ne \emptyset$. Let $FWD_a$ be the set of the forwarding arcs along $P_a$ and $RVS_a$ be the set of the reverse arcs along $P_a$. Define

$\iota(P_a) \equiv \min\{\iota_b \mid b \in P_a\}$, where $\iota_b$ takes [a value that is illegible in the source] for all $b \in FWD_a$ and $\iota_b = f_b^{\pi}$ if $b \in RVS_a$.

Let $\varepsilon = \min\{e_a, \iota(P_a)\} > 0$ and let $\mathbf{f}^*$ be the new flow which is obtained from $\mathbf{f}$ by decreasing $f_a^{\pi}$ and $f_b^{\pi}$ ($\forall b \in RVS_a$) by $\varepsilon$ and increasing $f_b^{\pi}$ ($\forall b \in FWD_a$) by $\varepsilon$. Apparently, $\mathbf{f}^* \in \Omega_A$. Note that the restorable flow against a failure of link $l \in E$ may be reduced by the increase of a flow in the forwarding arcs, namely at most by $|FWD_a| \cdot \varepsilon \le (|A| - 1) \cdot \varepsilon$.

Let $E_{FWD_a} = \{ l \in E \mid (a_1 \in FWD_a) \cup (a_2 \in FWD_a),\ l = \{a_1, a_2\} \}$. Then, for a link $l \notin E_{FWD_a}$, we have

$$L_l^*(\mathbf{f}^*) \le L_l^*(\mathbf{f}) - \omega \cdot \varepsilon + (|A| - 1) \cdot \varepsilon \qquad \text{(O-2)}$$

For a link $l \in E_{FWD_a}$, the increased amount of the flow in the forwarding arc may not be restored either. Thus,

$$L_l^*(\mathbf{f}^*) \le L_l^*(\mathbf{f}) - \omega \cdot \varepsilon + (|A| - 1) \cdot \varepsilon + \varepsilon \qquad \text{(O-3)}$$

Note that the link containing the arc $a$ is not in $E_{FWD_a}$ and thus [Equation (O-4) is illegible in the source].

Using Equations (O-1), (O-2), (O-3), (O-4) and the relation $\omega > |A|$, we can derive

$$L^*(\mathbf{f}^*) \le L^*(\mathbf{f}) - \varepsilon \cdot (\omega - |A|) < L^*(\mathbf{f})$$

This implies that there always exists a feasible descent direction at $\mathbf{f}$ if $\mathbf{f} \in \Omega_A \setminus \Omega_B$. Therefore, $L^*(\mathbf{f})$ has no local minima outside $\Omega_B$.

(Q.E.D.)

Note that although the above proof assumes an equal weight of $1/|E|$ on every failure event, any arbitrary weight on the failure events will produce the same result.

16. In the minimum-cost multicommodity flow problems, a multicommodity flow is defined to be a flow satisfying the flow conservation law (N-3) and the non-negativity constraints (N-2) [9].

In summary, the survivable virtual path routing (SVPR-KSP-LINE1) problem is reformulated as follows, which we call the relaxed SVPR-KSP-LINE (RSVPR-KSP-LINE) problem:

(RSVPR-KSP-LINE)

Minimize

$$L^*(\mathbf{f}) = \frac{1}{|E|} \sum_{l \in E} L_l^*(\mathbf{f})$$

over

$$\mathbf{f} \ge 0$$

subject to

flow conservation law (N-3).

The flow deviation method [22] [26] is an efficient algorithm for nonlinear multicommodity flow problems with convex constraint sets [13]. In order to calculate the search direction of the next iteration, the partial derivatives of the objective function are evaluated at the convergence point of each iteration. In the RSVPR-KSP-LINE problem, however, the objective function, $L^*$, is not differentiable everywhere. Although a descent direction at $\mathbf{f}$ could be obtained if the objective function were differentiable at $\mathbf{f}$, it is plausible that $L^*$ is not differentiable at the convergence point of each iteration. This is because $L^*$ is a piece-wise linear function, and a non-smooth point (kink) of such a function gives the lowest value in the search direction. Thus, the flow deviation method cannot be applied to the RSVPR-KSP-LINE problem directly.

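An alternative way to find a feasible descent direction is to minimize the directional derivative over all feasible directions. Since any feasible direction at $\mathbf{f}$ can be expressed by $\mathbf{y} - \mathbf{f}$ for some $\mathbf{y} \in \Omega_A$, the problem is formulated as a minimization of $DD(\mathbf{y}, \mathbf{f})$ over $\mathbf{y} \in \Omega_A$, where $DD(\mathbf{y}, \mathbf{f})$ is the directional derivative of $L^*(\mathbf{f})$ along $\mathbf{y} - \mathbf{f}$. Due to the non-smoothness of $L^*$, however, the directional derivative cannot be calculated just through a gradient. In order to overcome the difficulty, the following approximation, $DD^*(\mathbf{y}, \mathbf{f})$, is employed instead of $DD(\mathbf{y}, \mathbf{f})$:

$$DD^*(\mathbf{y}, \mathbf{f}) = \mathbf{g}^*(\mathbf{f}) \cdot \frac{\mathbf{y} - \mathbf{f}}{\|\mathbf{y} - \mathbf{f}\|}$$

where $\mathbf{g}^*(\mathbf{f}) = [g_i^*(\mathbf{f})]$ ($i \in A$),

$$g_i^*(\mathbf{f}) = \begin{cases} g_i^+(\mathbf{f}) & \text{if } y_i - f_i > 0 \\ g_i^-(\mathbf{f}) & \text{if } y_i - f_i < 0 \end{cases}$$

$$g_i^+(\mathbf{f}) = \lim_{h \to 0^+} \{L^*(\mathbf{f} + h \cdot \mathbf{e}_i) - L^*(\mathbf{f})\}/h \quad \text{and} \quad g_i^-(\mathbf{f}) = \lim_{h \to 0^-} \{L^*(\mathbf{f} + h \cdot \mathbf{e}_i) - L^*(\mathbf{f})\}/h$$

where $\mathbf{e}_i$ is a unit vector with the $i$-th component equal to one. Note that $DD^*(\mathbf{y}, \mathbf{f})$ is a directional derivative of a piece-wise linear approximation of $L^*$ using one-sided partial derivatives, $g_i^+(\mathbf{f})$ and $g_i^-(\mathbf{f})$.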
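The one-sided derivatives and the approximate directional derivative can be sketched numerically as follows. This Python fragment is a hedged illustration only: the function names are hypothetical, the limits are replaced by finite differences with a step size `h`, the averaging of the two one-sided values when a component of the direction is zero follows the heuristic described for Equation (O-5), and the division by the Euclidean length of the direction is an assumption that can be switched off with `normalize=False`.

```python
import math
from typing import Callable, List, Sequence, Tuple


def one_sided_partials(
    objective: Callable[[Sequence[float]], float], f: Sequence[float], h: float
) -> Tuple[List[float], List[float]]:
    """Finite-difference estimates of g_i^+ and g_i^- at f with step size h."""
    base = objective(f)
    g_plus, g_minus = [], []
    for i in range(len(f)):
        up, down = list(f), list(f)
        up[i] += h
        down[i] -= h
        g_plus.append((objective(up) - base) / h)
        g_minus.append((objective(down) - base) / (-h))
    return g_plus, g_minus


def dd_star(
    g_plus: Sequence[float],
    g_minus: Sequence[float],
    y: Sequence[float],
    f: Sequence[float],
    normalize: bool = True,
) -> float:
    """Approximate directional derivative along y - f from one-sided partials."""
    total = 0.0
    for gp, gm, yi, fi in zip(g_plus, g_minus, y, f):
        d = yi - fi
        if d > 0:
            g = gp
        elif d < 0:
            g = gm
        else:
            g = 0.5 * (gp + gm)  # heuristic value where the component is zero
        total += g * d
    if normalize:
        norm = math.sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, f)))
        if norm == 0.0:
            raise ValueError("y must differ from f")
        total /= norm
    return total


# Toy piece-wise linear objective with a kink at f[0] = 1 (illustrative only).
def toy(x: Sequence[float]) -> float:
    return abs(x[0] - 1.0) + 2.0 * x[1]


gp, gm = one_sided_partials(toy, [1.0, 0.5], h=1e-3)
```

At the kink the two one-sided values for the first component differ in sign, so moving in either direction along that component increases the toy objective, whereas decreasing the second component gives a negative value of `dd_star`.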


Now, the problem is to minimize $DD^*(\mathbf{y}, \mathbf{f})$ over $\mathbf{y} \in \Omega_A \setminus \{\mathbf{f}\}$. Consider applying the flow deviation method to this new minimization problem. Again, the objective function $DD^*(\mathbf{y}, \mathbf{f})$ is not differentiable everywhere. However, the only non-differentiable region is $\Lambda = \{\mathbf{y} \mid y_i - f_i = 0\ (\exists i \in A)\}$. The partial derivative of $DD^*(\mathbf{y}, \mathbf{f})$ (except at the points in $\Lambda$) can be obtained as


(0-5) 


When $y_i - f_i = 0$, we heuristically define $g_i^* = (g_i^+ + g_i^-)/2$ in Equation (O-5) in order to compute the partial derivative. According to our numerical experiments, $y_i - f_i = 0$ seldom happens during the iterations.


$DD^*(\mathbf{y}, \mathbf{f})$ is neither convex nor concave, so the flow deviation method gives only a stationary point of $DD^*(\mathbf{y}, \mathbf{f})$. With a good initial point $\mathbf{y}^0$, however, it has a good chance of converging to a near-optimal point. If $L^*$ is differentiable at $\mathbf{f}$, it is known that $\mathbf{y} - \mathbf{f}$ gives a feasible descent direction [22] where $\mathbf{y}$ is a shortest route flow obtained under the metric $\{\partial L^* / \partial f_i\}$, which is equal to $g_i^+ = g_i^- = (g_i^+ + g_i^-)/2$. Although $L^*$ is not usually differentiable at the convergence point of each iteration, $\mathbf{y}^0 - \mathbf{f}$ would give a good approximation to a descent direction if $\mathbf{y}^0$ is a shortest route flow under the metric $\{(g_i^+ + g_i^-)/2\}$. Therefore, using $\mathbf{y}^0$ as a starting point, the method could at least reach a local minimum with a negative directional derivative and possibly reach the minimum of $DD^*(\mathbf{y}, \mathbf{f})$.

The proposed algorithm, which we call the modified flow deviation (MFD) method, is summarized as follows:

Procedure SVPR-KSP-LINE-MFD()

0. Find a feasible starting point $\mathbf{f}^0 \in \Omega_A$. $\mathbf{f}^0$ can be a shortest route flow if no apparent feasible solution is available. Let $n = 0$.

1. Through the following procedures, obtain $\mathbf{y}^n$ such that $\mathbf{y}^n - \mathbf{f}^n$ would be the steepest descent direction.

a) Find a feasible starting flow $\mathbf{y}^0 \ne \mathbf{f}^n$ in $\Omega_A$. $\mathbf{y}^0$ is a shortest route flow under the metric $\{(g_i^+ + g_i^-)/2\}$. Let $m = 0$.

b) Obtain $\mathbf{w}^m$ as a shortest route flow taking $\{\partial DD^*(\mathbf{y}^m, \mathbf{f}^n) / \partial y_i\}$ as an arc length.

c) $\mathbf{y}^{m+1} = (1 - \mu^m)\,\mathbf{y}^m + \mu^m \mathbf{w}^m$,

where $\mu^m$ is the minimizer of $DD^*((1 - \mu^m)\,\mathbf{y}^m + \mu^m \mathbf{w}^m, \mathbf{f}^n)$ ($0 \le \mu^m \le 1$).

d) If $DD^*(\mathbf{y}^m, \mathbf{f}^n) - DD^*(\mathbf{y}^{m+1}, \mathbf{f}^n) < \delta$, then $\mathbf{y}^n = \mathbf{y}^{m+1}$ and go to Step 2. Otherwise, let $m = m + 1$ and go back to Step 1-b).

2. $\mathbf{f}^{n+1} = (1 - \lambda^n)\,\mathbf{f}^n + \lambda^n \mathbf{y}^n$,

where $\lambda^n$ is the minimizer of $L^*((1 - \lambda^n)\,\mathbf{f}^n + \lambda^n \mathbf{y}^n)$ ($0 \le \lambda^n \le 1$).

3. If $L^*(\mathbf{f}^{n+1}) = 0$ or $(L^*(\mathbf{f}^n) - L^*(\mathbf{f}^{n+1})) / L^*(\mathbf{f}^n) < \varepsilon$, then stop. Otherwise, let $n = n + 1$ and go back to Step 1.

The MFD method involves two iterative procedures, the outer iteration (Steps 1 to 3) and the inner iteration (Steps 1-b to 1-d). The inner iteration seeks the steepest descent direction at each loop of the outer iteration. Using the result of the inner iteration, the outer iteration proceeds towards a minimum point. A golden section search method [54] is used to implement the line search in Step 1-c) and Step 2.
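For reference, a generic golden section line search over the interval [0, 1] can be sketched in Python as follows. This is a textbook version written for illustration, not the code used in the experiments. The helper `line_search_step`, which applies the search to the convex combination of two flows as in Step 2, and the toy objective are hypothetical. The method locates the minimum reliably only when the function is unimodal on the interval, which is not guaranteed for a general piece-wise linear objective.

```python
import math
from typing import Callable, List, Sequence


def golden_section_minimize(
    func: Callable[[float], float], lo: float = 0.0, hi: float = 1.0, tol: float = 1e-6
) -> float:
    """Return an approximate minimizer of a unimodal func on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = func(x1), func(x2)
    while hi - lo > tol:
        if f1 <= f2:
            # The minimum lies in [lo, x2]; reuse x1 as the new right probe.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = func(x1)
        else:
            # The minimum lies in [x1, hi]; reuse x2 as the new left probe.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = func(x2)
    return 0.5 * (lo + hi)


def line_search_step(
    objective: Callable[[Sequence[float]], float], f: Sequence[float], y: Sequence[float]
) -> List[float]:
    """Move from f towards y by the step that minimizes the objective."""
    def along(step: float) -> float:
        return objective([(1.0 - step) * fi + step * yi for fi, yi in zip(f, y)])

    step = golden_section_minimize(along)
    return [(1.0 - step) * fi + step * yi for fi, yi in zip(f, y)]


# Toy piece-wise linear objective whose minimum along the segment is a kink.
def toy_objective(x: Sequence[float]) -> float:
    return abs(x[0] - 1.0) + abs(x[1] - 2.0)


new_point = line_search_step(toy_objective, [0.0, 0.0], [2.0, 4.0])
```

Each iteration reuses one of the two interior evaluations, so only one new objective evaluation is needed per iteration, which matters when every evaluation of the objective is expensive.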

Implementation  Issues 

Since $L^*(\mathbf{f})$ and $\mathbf{g}^*(\mathbf{f})$ have no closed-form expressions, they can be calculated only numerically. The calculation of $L_l^*(\mathbf{f})$ involves repeated application of the shortest path algorithm to find the successively shortest restoration routes over the residual network. Assuming Dijkstra's algorithm with a binary heap [14], it requires $O(A^2 \log V)$ time in the worst case since at least one arc is removed from the residual network at each application of the shortest path algorithm. Then, the total running time for $L^*(\mathbf{f})$ and $\mathbf{g}^*(\mathbf{f})$ goes up to $O(A^3 \log V)$ and $O(A^4 \log V)$, respectively. Although their complexities are polynomial, they slow down the algorithm considerably since these operations are invoked at every outer iteration. In order to reduce the computation time, a precalculated table of the restoration routes is used in our implementation. Although the size of such a table grows exponentially as the size of the network increases, our preliminary study shows that it is enough to consider the first 30 shortest restoration routes to obtain a very close approximation of $L^*(\mathbf{f})$. This strategy reduces the computational complexities of $L^*(\mathbf{f})$ and $\mathbf{g}^*(\mathbf{f})$ down to $O(A)$ and $O(A^2)$, respectively. Table O-1 shows the required CPU time for the MFD method. The same test conditions as those used in the SVPR-MF problems are employed in this experiment. The result indicates that the MFD method runs in a very short time and is applicable to a network with rapidly changing traffic demand patterns.

The MFD method employs several optimization parameters. Using two sample networks, we have examined the relationship of the optimization parameter values to the performance of the algorithm. Tens of demand patterns have been examined for each model to see the behavior of the algorithm in a lightly loaded situation as well as a heavily loaded case. Some are uniformly distributed and some are randomly distributed.

There are three parameters which might affect the performance of the algorithm: the stopping conditions, $\varepsilon$ and $\delta$, and the step size, $h$. The stopping conditions are employed in the termination tests in the outer and inner iterations, while the step size is used to numerically calculate the partial derivatives, $\mathbf{g}^*(\mathbf{f})$. It has been found that the stopping condition $\varepsilon$ has little effect on the performance of the algorithm as long as it is sufficiently small. Furthermore, when the load is light, the selection of the parameters does not make a significant difference in the convergence speed as long as they are reasonably small. Since the optimal solution set is not a singleton in this case, the procedure can find some point in the set rather easily regardless of the parameter values. In general, the convergence point depends on the choice of the parameters, but this is not a problem from the survivability point of view, since a minimum expected lost flow is attained.

In a heavily loaded situation, however, the convergence speed greatly depends on the selection of the two parameters, $h$ and $\delta$. Figure O-1 illustrates typical transitions of $L^*$ over iterations for various values of $h$ and $\delta$. The results on the NJ-LATA and US-WAN sample networks are reported. The latter case has a heavier offered load than the former. Three curves are plotted for each case representing small (case 1), medium (case 2) and relatively large (case 3) values of $h$. Two additional curves are depicted for different values of $\delta$. Note that the following discussion on the convergence is a generalization of the results of many experiments with different traffic patterns. There is no definite rule, and the convergence pattern is different in each case.

Table O-1. Required CPU time for the MFD method

Network Model                          NJ-LATA   (12,50)   (24,60)   US-WAN
100% RNL with 30 random variations     1.96 s    5.82 s    2.53 s    17.6 s
102.5% RNL with no variation           2.05 s    9.53 s    11.4 s    46.2 s

Generally speaking, a smaller step size, $h$, is thought to be desirable since it gives a more precise approximation of the partial derivatives, and this is the case when the load is not so heavy. In a heavily loaded situation, however, the procedure often stops prematurely after a sharp decrease in the first few iterations (Figure O-1-a: case 1). This phenomenon becomes more prominent when the load is too heavy (Figure O-1-b). The premature convergence is due to a lack of global information on the change of $L^*$. A search direction obtained through a smaller step size might be a descent direction only in the neighborhood around $\mathbf{f}$. This area is usually small for heavily loaded networks since a small flow change at one link can readily affect the restorability of some other links. Since the minimum point along the search direction often falls near $\mathbf{f}$, enough progress cannot be made at each iteration. As a result, the decrease of $L^*$ is too small to pass the termination test, and premature termination follows.


(a)  NJ-LATA  Sample  Network 


(b)  US-WAN  Sample  Network 


Figure O-1. Typical transition of L* over iterations
Furthermore, the accuracy of the linear approximation of a non-smooth function might degrade under a heavy load, since many turning points (kinks) are expected to exist in a small region. It is well known that the steepest descent method can fail around a kink if it is applied to a non-smooth optimization [49]. This is because the linear approximation based on a gradient may not be able to take into account a large change in the objective function value at a non-smooth point, even if the point is located in the neighborhood (see Figure O-2). Premature convergence with a smaller $h$ is also explained by this inaccuracy of the linear approximation.

A larger step size is generally considered undesirable since it may lead to a false search direction due to a possible error in obtaining derivatives. In fact, the procedure converges very slowly as shown in Figure O-1. However, as the load grows, only a relatively large $h$ can approach an optimal point (Figure O-1-b). Since the next direction is obtained by examining a broader range around $\mathbf{f}$, the minimum point along the search direction tends to reside away from $\mathbf{f}$. Namely, the procedure can find a longer descent slope, although it might not be the steepest descent direction at $\mathbf{f}$. Consequently, it usually produces a sufficient decrease to prevent premature convergence even in a heavily loaded network, although the attained improvement rate might be low. In other words, the linear approximation of $L^*$ with a large step size is not precise locally but gives a global view. This approximation works well in heavily loaded situations since the effect of dense non-smooth points is smoothed out over a relatively large neighborhood. Premature convergence at a kink can be avoided as shown in the example of Figure O-2-b. In addition, a search direction obtained through a larger step size might find a direction with a very long downhill, and this results in an occasional large reduction of $L^*$.

In summary, the best choice of step size greatly depends on the traffic load. A small $h$ gives the best result in a lightly loaded case, while with a heavy load it shows a quick decrease but results in premature convergence. Thus, a larger step size is required as the load grows, but it should not be unnecessarily large in order to avoid longer iterations (see Figure O-1-a). Since load information is not usually obtainable before starting the procedure, it is very hard to choose a single step size to give the best result. One possible guideline is to change the value over the iterations. The procedure starts with a smaller step size to achieve fast convergence for a lightly loaded case or to induce a quick decrease of $L^*$. Then, it gradually increases the step size to obtain a better solution for a heavily loaded case. Based on the experiments, this strategy generally works well, although in some cases more iterations are required.
The stopping condition, $\delta$, influences not only the required number of inner iterations but also that of outer iterations. Obviously, fewer inner iterations are necessary for a larger $\delta$. On the other hand, a smaller $\delta$ requires fewer outer iterations since a steeper downhill could be found at each iteration. Considering the fact that the outer iteration is more computationally expensive than the inner iteration, it is better to use a smaller value for $\delta$. Furthermore, a large $\delta$ may cause premature convergence since the minimization process of $DD^*$ can terminate before finding a descending direction. The value of $\delta$, however, should not be unnecessarily small. The objective function of the inner iteration is just an approximation. Fine tuning on such a function might not always lead to a better solution unless the approximation is very precise. In fact, a smaller $\delta$ occasionally fails to find a better direction. In the example shown in Figure O-1-a, the curve with $\delta = 0.01$ shows slower improvement than that with $\delta = 0.1$ at the early stage of the iterations. According to our experiments, $\delta$ less than 0.1% usually works well.


Figure O-2. Avoidance of premature convergence around a kink

With a small h, the MFD procedure may stall at a kink c if started from point a or b (Figure a). This is because a linear approximation by a small h around point c cannot take into account a large change of the objective function value at d. On the other hand, the procedure can get out of the kink with a large h, by one step from a, or by two steps from b (Figure b).




Appendix  P.  SVPR-KSP-LINE:  Solution  approach  - 
Lagrangian  relaxation  method 


As will be discussed in Section 4.4.4, the MFD method can find a near-optimal point when the network load is not heavy. Considering the speed of the solution procedure, the proposed algorithm is very practical. However, the optimality degrades as the load grows further. Premature convergence around a kink causes this problem, and a larger step size could avoid the problem as discussed above. Since more non-smooth points are introduced at a higher load, however, premature convergence becomes inevitable even with a large $h$, causing the degradation in optimality.

The second formulation (SVPR-KSP-LINE2) can eliminate the non-smoothness from the objective function. Thus, it is expected to reach a better solution at a higher load. However, the KSP constraints (N-10) are too complicated to be tackled directly. In order to overcome the difficulties, we apply the Lagrangian relaxation method. By dualizing complicated constraints, a Lagrangian relaxation method could simplify the solution procedure (see Appendix S for details). The technique would also provide a way to find a tighter lower bound which is useful in evaluating the proposed heuristics. A major disadvantage of this approach is that it would require considerably more computation time due to the increased number of constraints.

The  Lagrangian  relaxation  technique  seeks  a  better  solution  by  iteratively  solving  a  Lagrangian 
subproblem  and  a  Lagrangian  dual  problem.  The  problem  structure  heavily  depends  on  the  choice 
of a constraint set to be relaxed. Out of several possible options, we relax only the KSP
constraints. This choice has several advantages. First, it led to the best solution in
terms of optimality in our preliminary experiments. Second, a primally feasible solution
can be readily found since the commodity flow acquired in the Lagrangian subproblem is also
primally feasible (see footnote 17). Finally, a solution procedure similar to that for the
SVPR-MF-LINE problem can be adopted by relaxing the KSP constraints (Equation (N-12)).

For a given multiplier $\mu = (\mu_{a,k})$, define a Lagrangian function $\mathcal{L}_\mu(x, r, t)$ as:


17.  A  Lagrangian  relaxation  method  requires  a  mechanism  to  find  a  primally  feasible  solution  since  it  generally  pro¬ 
duces  only  a  dually  feasible  solution.  If  a  feasible  commodity  flow  is  at  hand,  then  a  feasible  restoration  flow  and  a  fea¬ 
sible  lost  flow  can  be  readily  computed  through  Equations  (N-9)  and  (N-10). 
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3p  (X,  r,  t)  s 


ae  A 


l/f/’l 

ae  Ajt=  1 


”"{4-  S  v-'  tCr i«f“"‘’i}} 

1=1 

-  ) 


Here, $\mu_{a,k}$ is a Lagrangian multiplier corresponding to the constraint $\mathrm{KSP}_{a,k}$, where $\mathrm{KSP}_{a,k}$ is
the KSP constraint for arc a and the k-th restoration path of the arc. Now, a Lagrangian
subproblem can be defined as follows:


Minimize 

3p(?.  E.  i) 

(SVPR-KSP-LINE-LAG) 

over 

X,  r,  t  >  0 

subject  to 

a) 

Jt  Jt 

Vji  e  n 

(P-1) 

peP' 

b) 

Vo  e  A,  V/  e 

(P-2) 

c) 

Va  e  A 

(P-3) 

[«P"] 


Let $\mathcal{L}(\mu)$ be the optimal value of the above minimization problem. Namely,

$\mathcal{L}(\mu) = \min\{\mathcal{L}_\mu(x, r, t) : \text{(P-1), (P-2) and (P-3)}\}$

Now the Lagrangian dual problem is stated as the following optimization problem:


Maximize  $\mathcal{L}(\mu)$  (SVPR-KSP-LINE-LAG-DUAL)

over  $\mu$

subject to  $\mu \ge 0$


By weak duality [4], $\min\{L \cdot |E|\} \ge \max\{\mathcal{L}(\mu) : \mu \ge 0\}$. The subgradient method described
in Appendix S.2 is employed to update the multipliers, which could give a tighter lower bound.
Now it can be easily shown that the multiplier is monotonically non-decreasing. Thus, $\mu$ should
be initialized to 0 so that a multiplier could reach any value in its domain. In brief, the proposed
solution procedure takes the following steps:

procedure SVPR-KSP-LINE-LAG() (brief description)

0. Set $\mu = 0$.

1. Solve the Lagrangian subproblem (SVPR-KSP-LINE-LAG).

2. Based on the feasible commodity flow assignment x obtained in Step 1, calculate a feasible
restoration flow r and a lost flow t through Equations (N-9) and (N-10).

3. Perform a termination test.

4. Solve the Lagrangian dual problem (SVPR-KSP-LINE-LAG-DUAL) by the subgradient
method. Update the Lagrangian multiplier $\mu$. Go to Step 1.

The above procedure can be interpreted as follows. The solution of the Lagrangian subproblem
at the first iteration is the same as the optimum of the SVPR-MF-LINE problem with an RP option
since $\mu = 0$. Due to the difference between the restoration protocols, the optimum of the
SVPR-KSP-LINE problem would be somewhat different from this initial solution. Thus, starting from the
solution to the MF-LINE-based network, the algorithm attempts to adjust a commodity flow to
accommodate this difference. From the viewpoint of the problem formulation, this difference
comes from the KSP constraints. Now, if a constraint $\mathrm{KSP}_{a,k}$ is violated, the subgradient
method increases the value of $\mu_{a,k}$ in proportion to the degree of the violation. In other words, a
penalty is imposed on violated KSP constraints in the Lagrangian subproblem and is applied in the
next iteration. With the aid of this penalty (Lagrangian multiplier), the algorithm updates a flow
towards the optimum for the KSP-based network. The penalty is refined over iterations, and the
procedure SVPR-KSP-LINE-LAG would eventually find a near-optimal solution to the
SVPR-KSP-LINE problem.

Now consider the Lagrangian subproblem (SVPR-KSP-LINE-LAG). First of all, define
$\mathcal{L}_\mu(x, r, t, y)$ as follows:
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3.(5. t.ty)  •  2  'o 


+  E  'LKf 

a€  Ajfc=  1 


f  k-\  \  ,,  . 

^a,k'  f a~  ^  ^  '*’  ^  y  a,  k 


where  =  0  or  1 


(a  e  A,  ^  e  |/?pi, /t  G  {0, .... 


i>) 


Then, $\mathcal{L}_\mu(x, r, t, y)$ can be transformed into the following:


|;fpP||;jp(|5.*)| 


3p(5,r,t,y)  =  ^  >  +  E  P«,t E  E  E  sp'M’ '  %  t ' ’’M 

ae  aI,  jt=  1  Pe  ^it=  1  A=  1  ’  7 


Z  S  Z  Pa,i-ya,i 

^ke  IRP"]'-  i  =  k+l 

f  |«P"| 


Pa,  I  ^0,1 


i=:k+l  /i  =  1 


Rpi>\  |pp<'’-')| 


i  =  Jt  +  1  A  =  1 


+  Z  Z  Z  ^aXy'aX^t 

ae  At=  1  /,  =  1 


where $\delta_{a,b}$ takes one if a = b and zero otherwise. Define


|/?p1  |rp^||«p'^'*’| 

Z  Pa,A-yfl,A-  Z  Z  Z 

Jfc=l  Pe>^A=lA=l  * 


(/=  {a,b}) 


Rl^-i-p  .-  y  p  ../. 

A  '^a,  A  Zj  i  •'^a,  j 

,  i  =  A  +  1 

|/?p1  |/tp‘‘''''’| 


I  =  A  +  1  A  =  1 


(/=  {a,fe}) 


^  |«p1  ^ 

'.*)  Pfj,  /  I  ^ RP^^'’^  ppt''.*)  Pa,  I  ^bji 


i  =  k+ I  A  =  1 
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^a,k  ^a,k 

aeAk= I  h=l  * 


Then, 


3p(?.!:.t,  y) 

1  <■ 

as  A  €[«?“] 

=  S  1  K- 

as  A  ^^^ks[RP“] 

^ppla.k)  +  C 

••a 

where  Ru  =  F 

^  a 

+

Now, let $\mathcal{L}(\mu, y)$ denote the minimum of $\mathcal{L}_\mu(x, r, t, y)$ for given y and
$\mu$. Then, $\mathcal{L}(\mu, y)$ can be obtained by solving the following optimization problem
(SVPR-KSP-LINE-LAG2):

Minimize 

E  +  S  E  ^  (SVPR-KSP-LINE-LAG2) 

as  A  '^ks  [RP"] 

over 

?.  €,  t  >  0 

subject  to  a) 

V'  Jt  JC 

L  =  ^ 

V7t6  n 

jyK 

psP 

b) 

fa^^a^<^a 

Va€  A,  vie  E 

a 

c) 

-fa+  E  =  0 

ks  [RP"] 

Vae  A 

Maximize 

Iten  asA  ^IsE  ^ 

a 

(SVPR.KSP-LINE-LAG2-DUAL) 

over 

a  ,  unrestricted,  ti  <  0 

subject  to  i) 

as  p^  Is  ' 

Vp  e  P^,  Vtc  G  n 

ii) 

V*  ^ - a 

hs  RP^"''"^ 

Vke  [RP‘‘],Vae  A 
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Va  G  A 


iii) 


w  <F^ 
a  a 


The (SVPR-KSP-LINE-LAG2-DUAL) shows the dual formulation. Due to the similarity of
their problem structures, the above problem can be approached in a similar fashion to the
SVPR-MF-LINE problem with an RP option. The pricing-out operation and dual feasibility test must be
slightly modified according to the changes in the dual constraints.


Let $\mathrm{KSP}_{a,k}^{h}$ denote the h-th term (see footnote 18) in the argument of the min function in the constraint
$\mathrm{KSP}_{a,k}$, $h \in \{0, 1, \ldots, |RP^{(a,k)}|\}$ (refer to Equation (N-10)). Now, suppose that the
following inequality holds:

$\mathrm{KSP}_{a,k}^{h^*} < \mathrm{KSP}_{a,k}^{h} \quad \forall h \ne h^*$

Let the indicator variables take the following values on such an occasion:

$y_{a,k}^{h} = 1$ for $h = h^*$ and $y_{a,k}^{h} = 0$ for $h \ne h^*$


Then, apparently $\mathcal{L}_\mu(x, r, t, y) = \mathcal{L}_\mu(x, r, t)$. Therefore, the following minimization problem
solves the Lagrangian subproblem:


Minimize  $\mathcal{L}(\mu, y)$  (SVPR-KSP-LINE-LAG3)

over  y

subject to

y* = 0 otherwise.

A  local  minimum  for  the  above  minimization  problem  can  be  found  by  the  procedure,  SVPR- 
KSP-LINE-LAG-local. 
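
The role of the indicator variables can be illustrated with a generic identity. The sketch below uses hypothetical scalars $K_0, \ldots, K_H$ in place of the terms inside the min function and a single multiplier $\mu \ge 0$; these symbols are assumptions introduced for illustration, and the identity is not the full Lagrangian function of this appendix.

```latex
\mu \cdot \min_{0 \le h \le H} K_h
  \;=\; \min \Bigl\{ \sum_{h=0}^{H} y_h \, \mu \, K_h \;:\;
        \sum_{h=0}^{H} y_h = 1,\; y_h \in \{0, 1\} \Bigr\},
  \qquad \mu \ge 0 .
```

Fixing y replaces the min by a single linear term, and setting $y_{h^*} = 1$ for a smallest $K_{h^*}$ recovers the original value. This is consistent with the observation above that the two Lagrangian functions coincide when y selects the smallest term.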


18.  It  is  assumed  to  be  counted  from  zero. 
procedure SVPR-KSP-LINE-LAG-local(y)

1. For a given y, obtain $\mathcal{L}(\mu, y)$ by solving the problem (SVPR-KSP-LINE-LAG2).

2. For all KSP constraints with $\mu_{a,k} > 0$, check if $\mathrm{KSP}_{a,k}^{h} \le \mathrm{KSP}_{a,k}^{i}$ ($\forall i \ne h$) with
$y_{a,k}^{h} = 1$. If true, go to Step 4.

3. Update y so that the above condition is met. Go to Step 1.

4. If there is $i \ne h$ with $\mathrm{KSP}_{a,k}^{i} = \mathrm{KSP}_{a,k}^{h}$ and $y_{a,k}^{h} = 1$ for some constraint $\mathrm{KSP}_{a,k}$
with $\mu_{a,k} > 0$, check whether setting $y_{a,k}^{i} = 1$ would reduce the objective function value. If the
value decreases, update y and go to Step 2.

5. Return with the minimum value and the corresponding values of y and x.

Note that at each iteration, the objective function value always improves since $\mu_{a,k} > 0$ and
$\exists i \ne h$ such that $\mathrm{KSP}_{a,k}^{i} < \mathrm{KSP}_{a,k}^{h}$. Therefore, the algorithm terminates within a finite number of
iterations since the number of combinations of y is finite.

The minimum calculated in the above manner is a local minimum and is not guaranteed to be a
global one (see footnote 19). Therefore, the obtained solution is not assured to be a lower bound of the optimum of
the  SVPR-KSP-LINE  problem.  Although  a  global  optimum  may  not  be  acquired  in  general,  a  ran¬ 
dom  search  technique  of  the  global  optimization  can  be  applied  to  find  an  estimate  of  a  lower 
bound  [77].  For  a  given  multiplier,  the  technique  randomly  selects  multiple  starting  points  y  and 
obtains  multiple  local  minima.  By  increasing  the  number  of  sample  points,  the  lowest  value  among 
the  local  minima  would  converge  to  a  global  minimum  in  a  probabilistic  sense  [77].  The  procedure 
SVPR-KSP-LINE-LAG-global  embodies  this  random  search  technique: 

procedure SVPR-KSP-LINE-LAG-global(bestLBD, bestUBD)

The arguments bestLBD and bestUBD contain the best lower and upper bounds ever found.

0. Set j = 1 and LBD = ∞.

1. Randomly generate y.

2. Call SVPR-KSP-LINE-LAG-local(y). Obtain a local minimum, Lmin_j, a corresponding
commodity flow assignment x, and an updated y. If LBD > Lmin_j, then set LBD = Lmin_j.

3. Obtain an expected lost flow, UBD_j, based on x via Equation (N-1). If bestUBD > UBD_j,
then set bestUBD = UBD_j.

4. Perform termination test.

a) If j < Min_Global_Iter, then j ← j + 1 and go to Step 1.

b) If LBD < bestLBD, go to Step 5.

c) If j ≥ Max_Global_Iter, then go to Step 5.

d) Otherwise, j ← j + 1 and go to Step 1.

5. Return with bestUBD, LBD, and y which gives LBD.

19. Since the Lagrangian subproblem (SVPR-KSP-LINE-LAG) is a concave minimization problem with a convex
constraint set, it can yield many local minima at its extreme points.

In the above procedure, at least Min_Global_Iter random samples are generated, and the
procedure stops with at most Max_Global_Iter samples. The global optimization procedure
SVPR-KSP-LINE-LAG-global involves extensive computation if a good estimate of the lower bound is
necessary. The values of Min_Global_Iter and Max_Global_Iter must be sufficiently large (see footnote 20) for
this purpose.
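
The multi-start loop above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the experiments: `local_search`, `sample_start` and `expected_loss` are hypothetical callables standing in for SVPR-KSP-LINE-LAG-local, the random generation of y, and Equation (N-1), respectively, and the stopping test in item c) is assumed to be "j ≥ Max_Global_Iter".

```python
import random


def multistart(local_search, sample_start, expected_loss,
               best_lbd, best_ubd, min_iter=20, max_iter=30, rng=None):
    """Random multi-start search following Steps 0-5 of the procedure.

    local_search(y0)  -> (local_min, x, y)   hypothetical local procedure
    sample_start(rng) -> y0                  hypothetical random start
    expected_loss(x)  -> upper bound value   hypothetical stand-in for (N-1)
    """
    rng = rng or random.Random(0)
    lbd, best_y = float("inf"), None          # Step 0
    j = 1
    while True:
        y0 = sample_start(rng)                # Step 1
        local_min, x, y = local_search(y0)    # Step 2
        if local_min < lbd:
            lbd, best_y = local_min, y
        best_ubd = min(best_ubd, expected_loss(x))   # Step 3
        # Step 4: at least min_iter samples, at most max_iter samples;
        # stop early once the estimate falls below the best known bound.
        if j >= min_iter and (lbd < best_lbd or j >= max_iter):
            return best_ubd, lbd, best_y      # Step 5
        j += 1
```

With the values reported in this appendix (20 and 30), the loop draws between 20 and 30 samples per call.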

Nonetheless, the approach based on the SVPR-KSP-LINE2 formulation is very useful in
obtaining a good approximation of an optimal solution. If the objective is to find a primal solution, not a
lower bound, then a good estimate of a globally optimal solution is not necessary in the
Lagrangian subproblem. Therefore, Min_Global_Iter and Max_Global_Iter need not have large
values. In our experiments, they are set to 20 and 30, respectively. Furthermore, the global
optimization procedure is not even necessary at all iterations. Instead, it would suffice to call the process
only occasionally, say only when a local minimum becomes too close to or beyond the best upper
bound found so far (see footnote 21). As the procedure converges, it is expected that a commodity flow
assignment tends to be near-optimal. This procedure typically produces a better solution than the MFD
method since it is immune to the issue arising from the kinks.


20. In order to reduce the sample points, a procedure proposed by Golden et al. could be applied [32] [33]. It is based on
a statistical extreme-value theory: Assuming randomness of the samples and a Weibull distribution for their limiting
distribution, it calculates a point estimate and a confidence interval for a global optimum. The above assumption is
empirically justified by applying it to a large-scale travelling salesman problem.

21.  Namely,  a  global  optimization  procedure  is  not  invoked  until  a  Lagrangian  multiplier  approaches  a  near-optimal 
point  in  the  Lagrangian  dual  problem.  If  a  local  minimum  becomes  too  close  to  or  beyond  an  upper  bound  estimate,  then 
either  a  Lagrangian  multiplier  is  near-optimal  in  the  Lagrangian  dual  problem,  or  the  multiplier  deviates  from  the  opti¬ 
mum  due  to  a  non-trivial  gap  between  the  local  and  global  minima  in  the  Lagrangian  subproblem.  In  the  former  case,  we 
can  terminate  the  procedure.  In  the  latter  case,  however,  it  is  necessary  to  find  a  better  local  minimum  to  adjust  the  mul¬ 
tiplier  in  the  next  iteration  of  the  Lagrangian  dual  problem.  In  either  case,  a  global  optimization  procedure  should  be 
invoked  to  check  if  adjustment  of  a  multiplier  is  necessary  or  not.  However,  a  thorough  random  search  is  not  required  on 
this  occasion.  If  the  gap  is  large,  then  a  better  local  minimum  can  typically  be  found  in  a  few  random  searches.  Even  if  a 
global  optimum  may  not  be  found,  it  is  sufficient  to  perform  a  few  random  searches  in  order  to  adjust  a  multiplier.  Thus, 
Min_Global_Iter and Max_Global_Iter need not have large values.
The proposed algorithm, which we call the LAG method, is summarized as follows:

procedure SVPR-KSP-LINE-LAG()

0. Initialize y. Set $\mu = 0$, UBD = ∞, LBD = 0 and j = 1.

1. Call SVPR-KSP-LINE-LAG-local(y). Obtain a local minimum, Lmin_j, a corresponding
commodity flow assignment x, and an updated y.

2. Obtain an expected lost flow, UBD_j, based on x via Equation (N-1).

If UBD > UBD_j, then set UBD = UBD_j.

3. If (UBD - Lmin_j) / UBD is less than a given threshold, then

a) Call SVPR-KSP-LINE-LAG-global(LBD, UBD) and obtain an estimated lower bound
ELBD_j and the best upper bound BUBD_j encountered during the procedure.

b) If UBD > BUBD_j, then set UBD = BUBD_j.

c) If LBD < ELBD_j, then set LBD = ELBD_j.

4. Stop if either of the following holds.

a) (UBD - LBD) / UBD is less than a given tolerance.

b) The step length given by Equation (S-7) on page 168 is less than a given tolerance.

5. Solve the Lagrangian dual problem through the subgradient method. Calculate the
subgradient and update the Lagrangian multiplier and step length.

6. j ← j + 1. Go to Step 1.


Appendix  Q.  Quadratic  Shortest  Path  (QSP)  Algorithm 


The  quadratic  shortest  path  (QSP)  algorithm  solves  the  following  problem: 

Find the shortest path from a source S to a destination D in the bidirectional network G
where each path is subject to two types of arc cost: the independent arc cost, $\{c_a\}$, is
imposed if a path goes through arc a, and the non-positive mutual arc cost, $\{m_{a,l}\}$, is
added if a path contains both arc a and link l.

In  the  SCFA-MF-ETE  problem,  and  ^  equal  d^  +  ^  ^  ^  and  ,  respectively,  while  in 
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the  SVPR-MF-ETE  problem,  the  two  costs  are  equal  to  and  pj,,  respectively. 

The  proposed  QSP  algorithm  works  if  the  following  three  conditions  are  met,  which  can  be  proven 
in  the  cases  of  the  SCFA-MF-ETE  and  SVPR-MF-ETE  problems: 
c  >  0  for  Va  e  A , 

a 

m  ,  <  0  for  Va  e  A  and  V/  e  E  , 

a,  l  “ 

X  A. 

The  algorithm  stems  from  the  idea  of  the  branch  and  bound  technique  [81].  It  searches  all  possi¬ 
ble  paths  via  a  depth-first  search.  Along  with  the  search,  unnecessary  paths  are  eliminated  from 
consideration  by  using  the  best  upper  bound  information  obtained  so  far. 

QSP  algorithm 

procedure main()

1) Find the shortest paths from all nodes to D, using $\{c'_a\}$ as an arc cost. Let d(v) denote
the minimum cost from a node v to D.

2) Let P be the shortest path from S to D found in the previous step. Set

$UB = \sum_{a \in P} c_a + \sum_{a \in P} \sum_{l \in P} m_{a,l}$.

3) Mark S as 'visited'. Mark all other nodes as 'unvisited'.

4) Call visit(S, ∅, 0).

5) The solution is given by P (the shortest path) and UB (its cost).

procedure visit(vertex, path, cost)

1) Mark vertex as 'visited'.

2) For all arcs a adjacent to vertex,

2-1) If end_node(a) is marked as 'visited', continue to the next arc.

2-2) If new_cost + d(end_node(a)) > UB, continue to the next arc, where new_cost
is given by

$new\_cost = cost + c_a + \sum_{b \in path} m_{b,a} + \sum_{l \in path} m_{a,l}$

2-3) If end_node(a) = D, then set UB = new_cost and P = path ∪ {a}. Continue
to the next arc.

2-4) Call visit(end_node(a), path ∪ {a}, new_cost).

3) Mark vertex as 'unvisited'.

Note that all variables except for the arguments of the procedure visit are global variables. The
function end_node(a) returns the end node of directed arc a. Dijkstra's algorithm can be applied
at step 1) in the procedure main() because $c'_a$ is assumed to be non-negative. The procedure visit
performs the depth-first search. UB provides the best upper bound of the shortest path length from
S to D ever found. If a path is routed over arc a, at least $c'_a$ is imposed on the path. Thus, d(v)
gives the lower bound of the cost from v to D. This lower bound is used to eliminate an
unnecessary path search at step 2-2.
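
The branch-and-bound search described above can be sketched in Python. This is an illustrative sketch under simplifying assumptions, not the algorithm as originally specified: mutual costs are keyed here by unordered pairs of arcs rather than by arc and link, the lower-bound arc cost is taken to be the independent cost plus every mutual cost touching the arc, and that lower-bound cost is assumed to be non-negative. The names `qsp` and `path_cost` are hypothetical.

```python
import heapq
from itertools import combinations


def path_cost(path, c, m):
    """Independent costs plus the mutual cost of every arc pair on the path."""
    total = sum(c[a] for a in path)
    return total + sum(m.get(frozenset(p), 0.0) for p in combinations(path, 2))


def qsp(arcs, c, m, S, D):
    """arcs: list of (u, v); c: arc -> cost >= 0; m: frozenset({a, b}) -> cost <= 0."""
    # Assumed lower-bound cost: charge all mutual costs touching an arc to it.
    c_lb = {a: c[a] + sum(w for pair, w in m.items() if a in pair) for a in arcs}
    assert all(w >= 0 for w in c_lb.values()), "sketch assumes c'_a >= 0"
    out, inc = {}, {}
    for a in arcs:
        out.setdefault(a[0], []).append(a)
        inc.setdefault(a[1], []).append(a)
    # main step 1: Dijkstra from D over reversed arcs gives d(v) for all v.
    d, nxt, heap = {D: 0.0}, {}, [(0.0, D)]
    while heap:
        dist, v = heapq.heappop(heap)
        if dist > d.get(v, float("inf")):
            continue
        for a in inc.get(v, []):
            u = a[0]
            if dist + c_lb[a] < d.get(u, float("inf")):
                d[u], nxt[u] = dist + c_lb[a], a
                heapq.heappush(heap, (d[u], u))
    if S not in d:
        return None, float("inf")
    # main step 2: the shortest path under c' gives the initial upper bound.
    first, v = [], S
    while v != D:
        first.append(nxt[v])
        v = nxt[v][1]
    best = {"path": first, "ub": path_cost(first, c, m)}
    visited = {S}

    def visit(vertex, path, cost):
        visited.add(vertex)
        for a in out.get(vertex, []):
            w = a[1]
            if w in visited or w not in d:
                continue
            new_cost = cost + c[a] + sum(m.get(frozenset((a, b)), 0.0) for b in path)
            if new_cost + d[w] >= best["ub"]:      # step 2-2: prune by lower bound
                continue
            if w == D:                             # step 2-3: improved complete path
                best["ub"], best["path"] = new_cost, path + [a]
                continue
            visit(w, path + [a], new_cost)         # step 2-4
        visited.discard(vertex)

    visit(S, [], 0.0)
    return best["path"], best["ub"]
```

Because every mutual cost is non-positive, summing the assumed lower-bound costs over the remaining arcs never exceeds the true remaining cost, which is what makes the pruning test safe in this sketch.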


Appendix  R.  Initialization  Procedures  for  the  SVPR  problems 


The  primal  simplex  algorithm  requires  an  initial  basic  feasible  solution.  Although  a  feasible 
solution  can  be  readily  obtained  for  the  SCFA-MF  problems,  a  feasible  commodity  flow  is  not 
always  apparent  for  the  SVPR-MF  problems.  The  two-phase  approach  for  the  simplex  algorithm 
should  be  applied  if  a  feasible  solution  is  not  immediately  available  [54].  This  appendix  gives  the 
details on the initial procedures for the SVPR-MF-LINE problem and the SVPR-MF-ETE
problem.

For  each  problem,  the  initial  procedure  takes  the  following  steps: 

Step  1.  Obtain  a  shortest  route  commodity  flow.  If  it  is  feasible,  go  to  Step  3. 

Step  2.  Find  an  initial  feasible  commodity  flow  by  solving  an  LP  problem  with  artificial  vari¬ 
ables. 

Step  3.  Obtain  an  initial  feasible  solution  to  the  SVPR  problem. 

Step  4.  Restore  a  flow  over  the  candidate  restoration  paths  one  by  one. 

The last step is not mandatory. However, this procedure helps to reduce a lost flow without
incurring simplex iterations, which significantly shortens the computation time. Candidate restoration
paths can be taken from the ones appearing in the optimal solution upon the network design (the
solution to the SCFA-MF problems). If the speed-up version of the SVPR-MF solution procedures
is engaged, further candidate paths are generated from the beginning. See Section 4.2.4. and
Section 4.3.4. for details.


R.1. Initialization procedure for SVPR-MF-LINE

If a shortest route flow is not feasible, then the following auxiliary linear programming problem
must be solved to find an initial basic feasible solution (Step 2). Let $\alpha = (\alpha_a)$ ($a \in A$) be a
vector of artificial variables where $\alpha_a$ is equal to the amount of violation on the capacity constraint for
arc a.


Minimize 

ae  A 

over 

X  >  0  and  a  >  0 

subject  to 

X"'  It  It 

Vti  e  n 

(R-I) 

peP” 

X  %,p'^p~^a-^a 
pB  P 

Vfl  6  A 

(R-2) 

Initial basic variables for this auxiliary problem consist of one commodity flow variable $x_p$ per
commodity (based on a shortest route flow), and either a slack variable or an artificial variable
per arc (depending on whether the capacity constraint is satisfied). Note that an artificial variable
need not be generated for an arc whose capacity constraint is satisfied in the first place. Furthermore,
once an artificial variable becomes zero, it can be removed. The problem is infeasible if the optimal
value of the above problem is non-zero.

The  solution  procedure  to  this  auxiliary  problem  is  again  decomposed  into  a  master  process  and 
a  sub-process.  The  master  process  checks  to  see  if  the  current  solution  is  globally  optimal.  Appar¬ 
ently,  if  all  artificial  variables  are  out  of  the  basis,  the  solution  is  optimal  and  the  procedure  termi¬ 
nates.  Otherwise,  the  following  dual  feasibility  condition  must  be  tested  in  order  to  find  paths  to 
generate: 




“  S  ^ 

ae  p 

where  and  il^  are  the  simplex  multipliers  corresponding  to  the  constraints  (R-1)  and  (R-2), 
respectively.  An  all-to-all  shortest  path  algorithm  [4]  can  be  employed  to  find  the  shortest  paths  for 
all  commodities  under  the  arc  cost  {-p^  >  0}  .  Generate  the  shortest  path  variables  violating  the 
condition  (R-3).  If  the  condition  is  satisfied  for  all  commodities,  the  solution  is  optimal  for  the 
auxiliary  problem,  implying  that  no  feasible  commodity  flow  exists. 

The  sub-process  employs  the  revised  simplex  algorithm  and  finds  an  optimal  solution  using  only 
the  generated  columns.  Since  the  solution  becomes  optimal  if  all  artificial  variables  become  non- 
basic,  the  process  attempts  to  select  an  artificial  variable  as  a  leaving  variable  during  the  course  of 
simplex  iterations.  The  basis  matrix  can  be  arranged  and  decomposed  as  follows,  which  could 
reduce  the  computation  time  of  the  simplex  algorithm: 


I  c 

/ 

7  C 

A,  -7  Mj 

Ai  7 

-7  Afj-AjC 

A^  I 

A2  7 

7  M^-A^C 

A3  M3 

U 

where  LU  =  M2-A2C.  The  first  set  of  columns  corresponds  to  key  path  flow  variables,  the  sec¬ 
ond  set  to  artificial  variables,  the  third  set  to  slack  variables  and  the  last  set  contains  non-key  path 
flow  variables. 

After  a  feasible  commodity  flow  is  found,  we  must  build  an  initial  feasible  solution  to  the 
SVPR-MF-LINE  problem  (Step  3).  The  following  procedures  achieve  this  task. 

•  A  key  commodity  flow  variable  remains  a  key  with  the  same  amount  of  flow. 

•  All  non-key  basic  commodity  flow  variables  also  remain  the  same. 

•  All  restoration  flow  variables  are  non-basic. 

•  All lost flow variables $t_a$ ($\forall a \in A$) are in the basis, and the value is obtained by the
restoration flow constraints.

•  For a basic slack variable $s_a$, create basic slack variables $s_a^l$ for $\forall l \in E_a$ with $s_a^l = s_a$.

•  For a non-basic slack variable $s_a$, produce basic slack variables $s_a^l = 0$ for all $l \in E_a$ but
one arbitrary $l' \in E_a$.
R.2. Initialization procedure for SVPR-MF-ETE


The  initial  procedure  is  the  same  as  that  for  the  SVPR-MF-LINE  problem  described  above, 
except  for  Step  3.  An  initial  feasible  basic  solution  to  the  SVPR-MF-ETE  problem  can  be  settled 
as  follows: 

•  First, generate all $(\pi, l)$ rows pertaining to commodity flow variables generated in Step 1 and
Step 2.

•  A key commodity flow variable remains a key with the same amount of flow.

•  All non-key basic commodity flow variables also remain the same.

•  Lost flow variables for all generated $(\pi, l)$ rows are in the basis, and the value is
obtained from the restoration flow constraints. All other lost flow variables are undefined.

•  All restoration flow variables are non-basic.

•  Calculate a value of slack for all capacity constraints.

•  For each non-key basic commodity flow, pick one slack variable whose value is zero.

•  Place all slack variables but the ones selected in the previous step in the basis.


Appendix  S.  Lagrangian  Relaxation  and  Subgradient  Method 


This  appendix  briefly  reviews  the  Lagrangian  relaxation  technique.  It  has  been  applied  to  a  wide 
range  of  optimization  problems  in  order  to  make  them  computationally  tractable.  This  appendix 
also  describes  the  subgradient  method  which  has  been  successfully  employed  in  the  Lagrangian 
dual  problem. 

S.1. Lagrangian relaxation


The  idea  of  relaxation  is  to  replace  the  optimization  problem  by  an  easier  problem.  The  relax¬ 
ation  technique  can  be  used  to  obtain  a  lower  bound  in  case  of  a  minimization  problem.  Further¬ 
more,  it  could  give  a  way  to  produce  a  good  approximation  of  the  original  problem.  Now,  a 
relaxation  is  defined  as  follows  [63].  Consider  the  minimization  problem  (P): 

(P)  $z_P = \min\{f(x) : x \in X\}$

Then, a minimization problem (RP) satisfying the conditions (S-1) and (S-2) is called a relaxation
problem of (P).

(RP)  $z_{RP} = \min\{f_{RP}(x) : x \in X_{RP}\}$

$X_{RP} \supseteq X$  (S-1)

$f_{RP}(x) \le f(x)$ for $\forall x \in X$  (S-2)

If (RP) is feasible, the following relation holds:

$z_{RP} \le z_P$  (S-3)

The simplest relaxation method is a constraint relaxation where some of the constraints are
removed while $f_{RP}(x) = f(x)$. The above two conditions are clearly satisfied.

A Lagrangian relaxation technique also removes constraints from the original problem.
Typically complicated constraints are relaxed to simplify the problem. Then each relaxed constraint is
added into the objective function after being multiplied by a distinct Lagrangian multiplier. Let
$X = \{x : x \in X_{RP},\ g_i(x) \le 0,\ i \in I\}$, where $g_i(x) \le 0$, $i \in I$, represents a complicated constraint
to be relaxed. Then, apparently condition (S-1) holds. The following definition of $f_{RP}(x)$ also
satisfies condition (S-2):

$f_{RP}(x) \equiv f(x) + \sum_{i \in I} \mu_i \cdot g_i(x)$ and $\mu_i \ge 0$ for $\forall i \in I$  (S-4)

where $\mu = (\mu_i)$, $i \in I$, is a Lagrangian multiplier. Define a Lagrangian function $\mathcal{L}(\mu)$ as:

$\mathcal{L}(\mu) \equiv \min\{f_{RP}(x) : x \in X_{RP}\}$  (S-5)

This minimization problem is called a Lagrangian subproblem. The Lagrangian function has
been proven to be a non-smooth concave function of $\mu$ and is guaranteed to have a subgradient
[60]. From (S-3), $\mathcal{L}(\mu)$ is assured to be a lower bound of the problem (P) for any non-negative $\mu$.
The tightest lower bound can be found by maximizing the Lagrangian function over $\mu$. This
maximization problem leads to the following Lagrangian dual problem:

$\mathcal{L}^* \equiv \max\{\mathcal{L}(\mu) : \mu \ge 0\}$  (S-6)
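
The lower-bound property can be checked directly from the definitions. The short derivation below is a sketch that uses only (S-1) through (S-5); the symbol $x^\dagger$ is introduced here for illustration and denotes an optimal solution of (P), which is assumed to exist.

```latex
% For any x in X we have g_i(x) <= 0 and mu_i >= 0, so the added term is non-positive:
f_{RP}(x) = f(x) + \sum_{i \in I} \mu_i \, g_i(x) \;\le\; f(x)
  \qquad \forall x \in X .
% Let x^\dagger be optimal for (P). Since X \subseteq X_{RP}:
\mathcal{L}(\mu) = \min_{x \in X_{RP}} f_{RP}(x)
  \;\le\; f_{RP}(x^\dagger) \;\le\; f(x^\dagger) = z_P
  \qquad \forall \mu \ge 0 .
```

Taking the maximum over $\mu \ge 0$ on the left-hand side then gives $\mathcal{L}^* \le z_P$.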

The subgradient algorithm described in the next section can be applied to the Lagrangian dual
problem. By iteratively solving the Lagrangian subproblem and the Lagrangian dual problem, we
can refine the value of $\mu$ over iterations and obtain a tighter lower bound. The Lagrangian
relaxation technique has also been employed to discover a good primal solution to problems with
complicated constraints. Theoretically, it could eventually find an optimal solution if there is no duality
gap. Although this may not be attained in practice, a solution of the Lagrangian subproblem could
be utilized to find an approximate solution to the original problem [4]. Starting from a dually
feasible solution, a primally feasible solution could be obtained with the aid of heuristics. The
development of such heuristics and the selection of a set of constraints to be relaxed are two major design
issues in the Lagrangian relaxation technique since they totally depend on the problem context.

S.2.  Subgradient  method 

The  subgradient  method  is  developed  for  a  concave  (convex)  non-differentiable  optimization 
problem  [74].  The  scheme  has  been  successfully  applied  to  the  Lagrangian  dual  problem  [4]  [63]. 
Now  the  Lagrangian  relaxation  problem  can  be  solved  by  the  following  steps  based  on  the  subgra¬ 
dient  method  (Steps  2,  3  and  4): 

0.  Obtain  an  initial  value  of  multiplier  p,  >  0 .  =  0  is  typically  used.  Set ;  =  1 . 

1.  For  a  given  p,  solve  the  Lagrangian  subproblem  3  (p)  =  min  {/^p  (x)  :x  e  Xpp} .  Sup¬ 
pose  that  X*  solves  this  problem. 

2.  Find  any  subgradient  of  3  (p)  at  x*  .  {g^  (x*)  :  (/  el))  can  be  used  [60]. 

3.  Update  a  multiplier  as  follows: 

p.  <-  [p.  +  0-'  •  g.  (x*)  ] 

where  [a]  ^  denotes  a  non-negative  part  of  a,  and  6^  is  a  step  length  at  they-th  iteration. 

4.  Perform  a  termination  test^^.  If  not  satisfied,  then  set  y  <-  ;  +  1  and  go  back  to  Step  1. 

Theoretically, the subgradient method converges if the step length is obtained from any
sequence satisfying $\theta^j \to 0$ and $\sum_j \theta^j = \infty$ [74]. For example, $\theta^j = 1/j$ satisfies this condition.

22. For example, the procedure stops when a step length becomes sufficiently small.

However, several heuristics have been developed and widely applied in practice for step length
selection in order to accelerate convergence [4] [31] [39]. Based on our experiments, the
heuristic developed by Held et al. [39] turns out to have a good convergence property in the
SVPR-KSP-LINE2 problem. We restate their method in the following. The step length is computed by:


$\theta^j = \lambda^j \, \frac{UB - \mathcal{L}(\mu)}{\| g(x^*) \|^2}$


where UB is the best upper bound of the objective function found so far. $\lambda^j$ is obtained as follows:

•  $\lambda^j = 1$ for j up to 2n, where n is a measure of the problem size. We make n equal to the
number of constraints.

•  Both $\lambda^j$ and the number of iterations are halved until the number of iterations reaches some
threshold value, z.

•  Then $\lambda^j$ is halved every z iterations until the resulting $\lambda^j$ is sufficiently small.
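The three rules above can be read as a schedule of blocks of iterations. The sketch below is one possible reading, not a definitive implementation: it assumes that "the number of iterations" means the length of the current block, that the initial scale factor is 1 as stated in the text, and that the step length has the usual form scale * (UB - dual value) / ||g||^2. The names `lambda_schedule`, `step_length`, `lam0` and `lam_min` are hypothetical.

```python
def lambda_schedule(n, z, lam0=1.0, lam_min=1e-3):
    """Yield the scale factor for iterations j = 1, 2, ... (z must be >= 1).

    Rule 1: keep lam0 for the first 2n iterations.
    Rule 2: halve both the factor and the block length until the block
            length reaches the threshold z.
    Rule 3: afterwards halve the factor every z iterations until it
            drops below lam_min (an assumed stopping threshold).
    """
    lam, block = lam0, 2 * n
    while lam >= lam_min:
        for _ in range(block):
            yield lam
        lam /= 2.0
        block = max(z, block // 2)   # never shrink the block below z


def step_length(lam, upper_bound, dual_value, g):
    """Step length lam * (UB - dual value) / ||g||^2 (assumed form)."""
    norm_sq = sum(gi * gi for gi in g)
    if norm_sq == 0.0:
        return 0.0                   # zero subgradient: no move is needed
    return lam * (upper_bound - dual_value) / norm_sq
```

With n = 4 and z = 2, for example, the schedule produces 8 iterations at 1, then 4 at 0.5, then blocks of 2 iterations at 0.25, 0.125 and so on. Exhausting the generator can serve as the termination test, since it ends once the scale factor is sufficiently small.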


Appendix  T.  Notations 


This  final  appendix  summarizes  the  notations  used  throughout  the  project. 


T.1.  General


$G = (V, A, c)$ : A network.

$V$ : A set of vertices (nodes) representing ATM switches.

$A$ : A set of (directed) arcs representing optical trunks.

$c = (c_a)$, $a \in A$ : A vector of arc capacity.

$E$ : A set of (undirected) links. Each link is composed of two directed arcs with the same
end-nodes but in opposite directions.

$E_a = \{ l \in E : a \notin l \}$ : A set of links excluding the one containing arc a.

$\Pi$ : A set of commodities. A commodity is a traffic flow from an origin to a destination.
One commodity is defined for each origin and destination pair.

$P^\pi$ : A set of all possible routes for commodity $\pi \in \Pi$.

$RP^a$ : A set of all possible or all candidate restoration routes for arc $a \in A$. (LINE related)

: A set of all possible or all candidate restoration routes for commodity $\pi \in \Pi$ against a
failure of link $l \in E$. (ETE related)

$q = (q^\pi)$, $\pi \in \Pi$ : A vector of the requested bandwidth for each commodity.


$f = (f_a^\pi)$ ($\forall \pi \in \Pi$, $\forall a \in A$) : A commodity arc flow vector.

$f_a$ ($\forall a \in A$) : The amount of an aggregate flow of arc a, $f_a = \sum_{\pi \in \Pi} f_a^\pi$.

$x = (x_p)$ ($\forall \pi \in \Pi$, $\forall p \in P^\pi$) : A commodity path flow vector.

$r = (r_p)$ ($\forall a \in A$, $\forall p \in RP^a$) : An arc restoration path flow vector. (LINE related)

: A restoration path flow vector, $\pi \in \Pi$, $l \in E$, $p \in RP$. (ETE related)

$r_a^l$ ($\forall a \in A$, $\forall l \in E_a$) : The amount of bandwidth of arc a released by affected
VPs due to a failure of link l. (ETE related)

$s_a^l$ ($\forall a \in A$, $\forall l \in E_a$) : The slack variable of the capacity constraint for arc a
upon a failure of link l.

: An arc-path indicator variable which equals 1 if arc a is contained in path p, and 0 otherwise.

: An indicator variable which equals 1 if both arc a and link l are contained in path p, and 0 otherwise.


$g^\pi$ ($\forall \pi \in \Pi$) : The price of commodity (dual variable).

($\forall a \in A$) : The price of arc restoration (dual variable).

($\forall a \in A$, $\forall l \in E_a$) : The price of arc per failure (dual variable).

($\forall \pi \in \Pi$, $\forall l \in E$) : The price of link restoration per commodity (dual variable).

$B$ : Basis matrix.

$|X|$ : Cardinality of a set X.

An (n, m) network : A network with n vertices and m arcs.

<a, b> : A commodity from a source node a to a destination node b.

a-b-c : A path from node a to c through b.


T.2.  SCFA  related


($\forall a \in A$) : A unit arc installation cost. It is assumed to be non-negative.

$z = (z_a)$ ($\forall a \in A$) : A spare capacity vector.

($\forall a \in A$, $\forall l \in E_a$) : The amount of necessary spare bandwidth of arc a upon a
failure of link l. (SCFA only)

T.3.  SVPR  related


$t = (t_a)$ ($\forall a \in A$) : An arc lost flow vector. (LINE related)

$t = (t^{\pi,l})$ ($\forall \pi \in \Pi$, $\forall l \in E$) : A lost flow vector. (ETE related)

$z_a^l$ ($\forall a \in A$, $\forall l \in E_a$) : The amount of total restoration flow of arc a upon a
failure of link l. (SVPR only)

($\forall a \in A$, $\forall l \in E$) : The price of arc per failure (dual variable). (ETE related)

$L(f)$ : An expected lost flow given a flow f.

$L_l(f)$ : A lost flow due to a failure of link l given a flow f.

$(\cdot)_{v,a}$ : A node-arc incidence variable which equals 1 if arc a leaves node v, -1 if arc a
enters node v, and 0 otherwise.

$(\cdot)^{\pi}$ : An indicator variable which takes 1 if v is an originating node of commodity $\pi$,
-1 if v is a destination node of $\pi$, and 0 otherwise.

$y(l, k)$ : The residual capacity of arc p after KSP restoration using the first k restoration
paths upon a failure of link l.

$RP^a(k)$ : The k-th restoration path for arc a. A set of arcs over the path.

$h$ : The step size to numerically calculate one-sided partial derivatives. (MFD related)

$RP^a_{(\cdot)}(k)$ : The k-th arc in the k-th restoration path for arc a.

: The KSP constraint for arc a and its k-th restoration path for (SVPR-KSP-LINE2).

$\mu$ : Lagrangian multiplier for each KSP constraint.

$\mathcal{L}_\mu(x, r, t)$ : Lagrangian function.